As a webmaster, the Crown Web editor feels that even a little understanding of website log analysis makes a site more secure. Web logs can be downloaded over FTP or directly from the server. After downloading a log, we can analyze the IP segment each request came from to determine whether that segment belongs to a real spider.

Below, the Crown Web editor walks through a log analysis of the Crown Web site.

First, the editor downloaded the 8-2 web log from the server.

Second, open the web log.


As the picture above shows, the raw log is very messy and tiring to read. It is hard to work out which IPs visited our site and which pages were crawled by spiders. Since it is such a mess, we can apply one simple processing step that turns a difficult task into an easy one: the editor converts the log from TXT format to XLS. That way we can analyze it in rows and columns. After the conversion, each field sits in its own column, so the data is laid out in a detailed and orderly way.
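The same text-to-columns step can also be scripted. The sketch below is only an illustration: it assumes the log uses the Apache/Nginx "combined" format, which this article does not confirm, and it writes CSV (which Excel opens) rather than XLS. The names LOG_PATTERN, parse_line and log_to_csv are hypothetical, and the sample line is made up.

```python
import csv
import io
import re

# Assumption: Apache/Nginx "combined" log format. IIS or custom formats
# need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)
FIELDS = ["ip", "time", "method", "path", "status", "size", "referer", "agent"]


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def log_to_csv(lines, out):
    """Write every parseable line as one CSV row; return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for line in lines:
        row = parse_line(line.strip())
        if row:
            writer.writerow(row)
            count += 1
    return count


# Made-up sample line for illustration only, not taken from the real log.
SAMPLE = [
    '220.181.108.10 - - [02/Aug/2013:10:00:00 +0800] "GET /news/1.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0)"',
]

buffer = io.StringIO()
rows_written = log_to_csv(SAMPLE, buffer)
print(buffer.getvalue())
```

To process a real file, pass an open log file as lines and an open output file (created with newline="") as out, then open the resulting CSV in Excel.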


With the data laid out this way, we can clearly see each visitor's source IP, which specific article page was requested, and which column was crawled. We can even see the crawl status of the site. Even so, it is still not easy to tell how many distinct IP segments visited the site. For that, we use Excel's data filter to group similar IPs together. For example, all addresses in 220.181.108.* share the same first three octets (the same C segment), so we can group addresses from the same C segment together. The steps are: Data – AutoFilter – Custom – Contains – enter the IP segment to be grouped.
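The grouping by C segment can also be sketched in a few lines of code instead of an Excel filter. This is a minimal illustration, not the editor's method: c_segment and count_by_segment are hypothetical names, and apart from the 220.181.108.* prefix the sample addresses are made up.

```python
from collections import Counter


def c_segment(ip):
    """Return the first three octets of an IPv4 address followed by '.*'."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return ".".join(octets[:3]) + ".*"


def count_by_segment(ips):
    """Count visits per C segment, most frequent first."""
    return Counter(c_segment(ip) for ip in ips).most_common()


# Illustrative addresses only; apart from the 220.181.108.* prefix they are made up.
sample_ips = ["220.181.108.10", "220.181.108.77", "203.0.113.5", "220.181.108.10"]
segment_counts = count_by_segment(sample_ips)
for segment, hits in segment_counts:
    print(segment, hits)
```

Feeding it the IP column of the converted log gives one row per segment with its visit count, which is the summary the filter produces one segment at a time.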


From the filtered results, we can see directly which IP segments have visited our site. Next we need to know whether each of these segments is good or bad.
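Once some segments have been labeled, checking an address against them can be sketched as follows. This is an assumption-laden illustration: KNOWN_SEGMENTS and label_ip are hypothetical names, and the only entry in the table is the 220.181.108.* segment that this article attributes to Baidu spider. Membership in a labeled segment only reflects your own table; it does not by itself show who operates the address.

```python
import ipaddress

# Label table: only the 220.181.108.* entry comes from this article; extend it
# with segments you have verified yourself.
KNOWN_SEGMENTS = {
    ipaddress.ip_network("220.181.108.0/24"): "Baidu spider",
}


def label_ip(ip):
    """Return the label of the first known segment containing ip, else 'unknown'."""
    address = ipaddress.ip_address(ip)
    for network, label in KNOWN_SEGMENTS.items():
        if address in network:
            return label
    return "unknown"


print(label_ip("220.181.108.10"))
print(label_ip("203.0.113.5"))
```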

A webmaster should know which IP segments are good and which are not, so that problems on the site can be caught before they keep spreading. The commonly seen IP segments are as follows.

One, the 220.181.108.* IP segment, Baidu spider (the weight-boosting spider):

Visits from this IP segment indicate that your site is very healthy. The more visits per day, the friendlier Baidu spider is toward your site. Once a page has been crawled from this segment, it is released very quickly.

two, 123>

