ASIC/FPGA Design and Verification Out Source Services
This page presents a script, used by this site, that extracts which pages
have been accessed from the apache2 access log.
This page explains the basic parts of the Perl script.
In the first part of the script, I build the date in the very same format
used by the apache2 web server.
While the day and year are simple, the month is taken from an array:
- my @a_month = ();
- push(@a_month, "Jan");
- push(@a_month, "Feb");
- ...
- my $date_tmp=`date +%m`; chomp($date_tmp);
- my $date=$a_month[$date_tmp-1];
- $date_tmp=`date +%d`; chomp($date_tmp);
- $date=$date_tmp . "/" . $date;
- $date_tmp=`date +%y`; chomp($date_tmp);
- if( length($date_tmp) == 2 ) {$date_tmp="20" . $date_tmp;}
- $date=$date . "/" . $date_tmp;
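As a sketch, the same date string can also be built without calling the external date command, using Perl's built-in localtime. The helper name format_apache_date is made up for this illustration and is not part of the original script.

```perl
use strict;
use warnings;

# Month names in the order localtime counts them (0 = Jan).
my @a_month = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: takes the day, month and year fields as returned by
# localtime (month is 0-based, year is years since 1900) and returns a
# string such as "07/Mar/2024".
sub format_apache_date {
    my ($mday, $mon, $year) = @_;
    return sprintf("%02d/%s/%04d", $mday, $a_month[$mon], $year + 1900);
}

# Fields 3, 4 and 5 of localtime are day of month, month and year.
print format_apache_date((localtime)[3, 4, 5]), "\n";
```

Because the year comes back as a full number, the check on a two-digit year is not needed in this variant.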
The next part is the hash. The hash key is the HTML file, and its value
is the number of times the page was visited today.
- my %hash_cnt = ();
- my $fp_val="";
- ...
- if( $line =~ /[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*.*(my_web\/.*\.html).*/ ) {
- $t1=$1;
- #filter two html entries
- $search_ix=index($t1, "html ");
- if( $search_ix >= 0 ) {
- $t1=substr($t1, 0, $search_ix+4);
- }
- #skip entries that are not one of my pages
- $search_ix=index($t1, "www\.google");
- if( $search_ix < 0 ) {
- $fp_cnt=$t1;
- $hash_cnt{ $fp_cnt }++;
- }
- }
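The counting step can be put together as a small self-contained sketch. The sub name count_pages, the sample log lines and the way today's date is matched are assumptions for illustration; the original script hides that part behind "...".

```perl
use strict;
use warnings;

# Hypothetical helper: count hits per page for the lines that carry $date
# (a string such as "07/Mar/2024"). @lines are raw access-log lines.
sub count_pages {
    my ($date, @lines) = @_;
    my %hash_cnt;
    for my $line (@lines) {
        next if index($line, $date) < 0;             # assumed date filter
        next unless $line =~ /(my_web\/.*\.html)/;   # no page of interest
        my $page = $1;
        # The greedy match may run on into a second html entry (the
        # referrer), so cut right after the first "html".
        my $ix = index($page, "html ");
        $page = substr($page, 0, $ix + 4) if $ix >= 0;
        next if index($page, "www.google") >= 0;     # not one of my pages
        $hash_cnt{$page}++;
    }
    return %hash_cnt;
}

# Made-up sample lines using documentation-only IP addresses.
my @sample = (
    '192.0.2.1 - - [07/Mar/2024:10:00:00 +0000] "GET /my_web/index.html HTTP/1.1" 200 512',
    '192.0.2.2 - - [07/Mar/2024:10:05:00 +0000] "GET /my_web/index.html HTTP/1.1" 200 512 "http://example.com/my_web/links.html"',
    '192.0.2.3 - - [06/Mar/2024:23:59:00 +0000] "GET /my_web/old.html HTTP/1.1" 200 512',
);
my %cnt = count_pages("07/Mar/2024", @sample);
print "$_ $cnt{$_}\n" for sort keys %cnt;   # prints: my_web/index.html 2
```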
Finally, the data is written out of the hash to create an HTML report,
using a reverse sort (b <=> a and not a <=> b) to show the largest value first:
for my $key ( sort {$hash_cnt{$b} <=> $hash_cnt{$a}} keys %hash_cnt ) {
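A minimal sketch of the report loop follows. The sub name report_rows and the table markup are assumptions, since the original page does not show the body of the loop. The extra "or $a cmp $b" is also an addition; it only keeps pages with equal counts in a stable, alphabetical order.

```perl
use strict;
use warnings;

# Hypothetical helper: return one HTML table row per page, ordered by
# descending hit count ($b before $a gives the reverse numeric sort).
sub report_rows {
    my (%hash_cnt) = @_;
    my @rows;
    for my $key (sort { $hash_cnt{$b} <=> $hash_cnt{$a} or $a cmp $b } keys %hash_cnt) {
        push @rows, "<tr><td>$key</td><td>$hash_cnt{$key}</td></tr>";
    }
    return @rows;
}

print "$_\n" for report_rows("my_web/a.html" => 3, "my_web/b.html" => 7);
```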
Note that in many cases I use the Perl function index to find a
string within a string. For a fixed string this is generally faster than the
regular expression syntax:
if( $line =~ //) ...
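The two approaches can be compared side by side in a short sketch. The sub names contains_index and contains_regex are made up for this illustration. Note that index treats the search string literally, while a regular expression needs \Q...\E so that characters such as "." are not read as wildcards.

```perl
use strict;
use warnings;

# Plain substring search: index returns the position of the first match,
# or -1 when the string is not found.
sub contains_index {
    my ($line, $needle) = @_;
    return index($line, $needle) >= 0 ? 1 : 0;
}

# Same test with a regular expression; \Q...\E quotes metacharacters so
# that "." only matches a literal dot.
sub contains_regex {
    my ($line, $needle) = @_;
    return $line =~ /\Q$needle\E/ ? 1 : 0;
}

print contains_index("GET http://www.google.com/", "www.google"), "\n";
print contains_regex("GET http://www.google.com/", "www.google"), "\n";
```

To measure the difference on a given machine, the core Benchmark module (for example its cmpthese function) can time both subs on real log lines.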
The script can also filter out entries based on an IP list, for instance
entries that start with my router's IP at home or at work.
- my $filter_ip_1="";
- my $entry_ip="";
- my @filter_ip_a = ();
- push(@filter_ip_a, ""); #home
- push(@filter_ip_a, ""); #broadcom
- ...
- #filter my views, which come from my router
- FILT_L : foreach $searchAix (@filter_ip_a) {
- $search_ix=index($entry_ip, $searchAix);
- last FILT_L if ($search_ix >= 0);
- }
- if( $search_ix < 0 ) {
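The filter loop can be sketched as a small function. The name is_filtered and the IP addresses in the usage line are hypothetical (the addresses come from the ranges reserved for documentation). Unlike the loop above, this variant only accepts a match at position 0, so that the entry really starts with the listed IP, and it skips empty list entries, because index returns 0 for an empty search string and would filter every line.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 when $entry_ip starts with any of the
# prefixes in the filter list, 0 otherwise.
sub is_filtered {
    my ($entry_ip, @filter_ip_a) = @_;
    for my $prefix (@filter_ip_a) {
        next if $prefix eq "";                        # empty entry matches everything
        return 1 if index($entry_ip, $prefix) == 0;   # match at the start only
    }
    return 0;
}

print is_filtered("192.0.2.10", "192.0.2.", "198.51.100.7"), "\n";
```

A prefix such as "192.0.2.1" also matches "192.0.2.10", so a full address may need an exact comparison instead.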
The script also prints an IP report. Only the most popular IPs are printed
(above 30 percent of the total). The most popular ones are printed in a darker
color than the others:
IP report
100% 50
088% 44
086% 43
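A rough sketch of how such a report could be produced is shown below. It assumes that the percentage is taken relative to the busiest IP, which is what the 100% in the first sample row suggests; the original script may compute it differently. The sub name ip_report and the sample addresses are hypothetical, and the coloring is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return report lines "PCT% COUNT IP" for every IP
# whose hit count is above $min_pct percent of the busiest IP's count.
sub ip_report {
    my ($min_pct, %ip_cnt) = @_;
    my @ips = sort { $ip_cnt{$b} <=> $ip_cnt{$a} or $a cmp $b } keys %ip_cnt;
    return () unless @ips;
    my $max = $ip_cnt{ $ips[0] };     # the largest count defines 100%
    my @rows;
    for my $ip (@ips) {
        my $pct = int(100 * $ip_cnt{$ip} / $max);
        next if $pct <= $min_pct;     # drop the less popular IPs
        push @rows, sprintf("%03d%% %d %s", $pct, $ip_cnt{$ip}, $ip);
    }
    return @rows;
}

print "$_\n" for ip_report(30,
    "192.0.2.1" => 50, "192.0.2.2" => 44, "192.0.2.3" => 43, "192.0.2.4" => 10);
```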