The Best Cyprus Community


A question for GR!

Feel free to talk about anything that you want.

Re: A question for GR!

Postby Get Real! » Mon Apr 23, 2018 8:19 am

Sotos wrote:Google makes billions in profit every year. If you could get something even remotely close by spending just a year and a few thousand, then you would do it. The reason you don't want to spend that amount of time and money is that you know the result will be crap, nobody will use it, and your time and money will be totally wasted.

Yes, typical of Sotos to always move the goalposts to...

Write the software AND surpass Google’s popularity/profits!

:lol: :lol: :lol:
Get Real!
Forum Addict
Posts: 48333
Joined: Mon Feb 26, 2007 12:25 am
Location: Nicosia

Re: A question for GR!

Postby Sotos » Mon Apr 23, 2018 8:20 am

Get Real! wrote:Lots of bots paid my site a visit today including:

* Unknown robot (identified by 'bot*')
* Unknown robot (identified by hit on 'robots.txt')
* Unknown robot (identified by empty user agent string)
* Yandex bot
* Unknown robot (identified by 'spider')
* Google AdSense
* Googlebot
* Unknown robot (identified by 'robot')
* SeznamBot
* MJ12bot
* Unknown robot (identified by 'crawl')
* Python-urllib
* Mail.ru bot
* Sogou Spider

Do you think all the above authors have a Google-like data center to aid them? :lol:

Some are just Uni students slaving away from some dormitory for Christ's sake... :roll:


There is a huge difference between having a bot and having a search engine. Having a bot is the easiest thing. You only need a few lines of code to write a bot that crawls the web. I can make it so that the next time you check your logs you will find that "I am Get Real Bot" has visited your site, and I don't even need to write a bot to do that. If I spend a few minutes I can have "I am Get Real Bot" crawl the web and visit lots of websites every day. That is NOT a search engine, it is not even a search engine bot, let alone a search engine that can compete with Google.
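To illustrate the "few lines of code" point, here is a minimal sketch of the one step that makes a fetcher a crawler: pulling the links out of a downloaded page so they can be queued for the next visit. The function name `extractLinks` and the sample HTML are hypothetical, and the regular expression is a deliberate simplification (a real crawler would use an HTML parser and respect robots.txt).

```javascript
// Hypothetical helper: collect the absolute http(s) links found in a page.
// Regex-based extraction is a simplification; it only handles quoted href
// attributes on <a> tags.
function extractLinks(html, baseUrl) {
  const links = new Set();
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    try {
      // Resolve relative links against the page they were found on.
      const url = new URL(match[1], baseUrl);
      if (url.protocol === 'http:' || url.protocol === 'https:') {
        url.hash = ''; // the fragment never reaches the server
        links.add(url.href);
      }
    } catch (err) {
      // Malformed URL: skip it and keep going.
    }
  }
  return [...links];
}

// Sample page, invented for this example.
const samplePage =
  '<a href="/about">About</a> ' +
  '<a class="x" href="http://other.example/page#top">Other</a> ' +
  '<a href="mailto:me@example.com">Mail</a> ' +
  '<a href="/about">About again</a>';

console.log(extractLinks(samplePage, 'http://mysite.com/'));
```

Feeding each returned URL back into a fetch queue is what turns this into a crawl. None of it involves indexing or ranking, which is where the search engine part begins.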
Sotos
Leading Contributor
Posts: 11357
Joined: Wed Aug 17, 2005 2:50 am

Re: A question for GR!

Postby Get Real! » Mon Apr 23, 2018 8:24 am

Sotos wrote:
Get Real! wrote:Lots of bots paid my site a visit today including:

* Unknown robot (identified by 'bot*')
* Unknown robot (identified by hit on 'robots.txt')
* Unknown robot (identified by empty user agent string)
* Yandex bot
* Unknown robot (identified by 'spider')
* Google AdSense
* Googlebot
* Unknown robot (identified by 'robot')
* SeznamBot
* MJ12bot
* Unknown robot (identified by 'crawl')
* Python-urllib
* Mail.ru bot
* Sogou Spider

Do you think all the above authors have a Google-like data center to aid them? :lol:

Some are just Uni students slaving away from some dormitory for Christ's sake... :roll:


There is a huge difference between having a bot and having a search engine. Having a bot is the easiest thing. You only need a few lines of code to write a bot that crawls the web. I can make it so that the next time you check your logs you will find that "I am Get Real Bot" has visited your site, and I don't even need to write a bot to do that. If I spend a few minutes I can have "I am Get Real Bot" crawl the web and visit lots of websites every day. That is NOT a search engine, it is not even a search engine bot, let alone a search engine that can compete with Google.

No Sotos, AWStats and similar visitor statistics software can tell the difference between a real bot and a user who just changed his user agent string (UAS) to "I am Get Real Bot". :lol:
Get Real!
Forum Addict
Posts: 48333
Joined: Mon Feb 26, 2007 12:25 am
Location: Nicosia

Re: A question for GR!

Postby Sotos » Mon Apr 23, 2018 8:26 am

Get Real! wrote:
Sotos wrote:Google makes billions in profit every year. If you could get something even remotely close by spending just a year and a few thousand, then you would do it. The reason you don't want to spend that amount of time and money is that you know the result will be crap, nobody will use it, and your time and money will be totally wasted.

Yes, typical of Sotos to always move the goalposts to...

Write the software AND surpass Google’s popularity/profits!

:lol: :lol: :lol:


:lol: If you could have a search engine that is better than Google's then you could certainly "get something even remotely close" (I didn't say surpass) to Google's profits and popularity, which would bring you tons of profit if you only had to invest a year of your own time and 2-3 thousand to create it. Remember: Your product should be OBJECTIVELY better. You being delusional and believing that it is better doesn't count, and you wouldn't make a cent out of it.
Sotos
Leading Contributor
Posts: 11357
Joined: Wed Aug 17, 2005 2:50 am

Re: A question for GR!

Postby Sotos » Mon Apr 23, 2018 8:34 am

Get Real! wrote:
Sotos wrote:
Get Real! wrote:Lots of bots paid my site a visit today including:

* Unknown robot (identified by 'bot*')
* Unknown robot (identified by hit on 'robots.txt')
* Unknown robot (identified by empty user agent string)
* Yandex bot
* Unknown robot (identified by 'spider')
* Google AdSense
* Googlebot
* Unknown robot (identified by 'robot')
* SeznamBot
* MJ12bot
* Unknown robot (identified by 'crawl')
* Python-urllib
* Mail.ru bot
* Sogou Spider

Do you think all the above authors have a Google-like data center to aid them? :lol:

Some are just Uni students slaving away from some dormitory for Christ's sake... :roll:


There is a huge difference between having a bot and having a search engine. Having a bot is the easiest thing. You only need a few lines of code to write a bot that crawls the web. I can make it so that the next time you check your logs you will find that "I am Get Real Bot" has visited your site, and I don't even need to write a bot to do that. If I spend a few minutes I can have "I am Get Real Bot" crawl the web and visit lots of websites every day. That is NOT a search engine, it is not even a search engine bot, let alone a search engine that can compete with Google.

No Sotos, AWStats and similar visitor statistics software can tell the difference between a real bot and a user who just changed his user agent string (UAS) to "I am Get Real Bot". :lol:


Awstats can do shit ;) It even tells you how it "knows" that the visit was made by a bot: It is either a known bot (e.g. Google AdSense) or it is an unknown user agent whose name contains things like "Bot", "Spider", "Crawler" etc.
Sotos
Leading Contributor
Posts: 11357
Joined: Wed Aug 17, 2005 2:50 am

Re: A question for GR!

Postby Get Real! » Mon Apr 23, 2018 8:38 am

Sotos wrote:Awstats can do shit ;) It even tells you how it "knows" that the visit was made by a bot: It is either a known bot (e.g. Google AdSense) or it is an unknown user agent whose name contains things like "Bot", "Spider", "Crawler" etc.

:? A bot’s request header is always different to that of a browser.

Giauto oussou gamo to geraton mou… :lol:
Get Real!
Forum Addict
Posts: 48333
Joined: Mon Feb 26, 2007 12:25 am
Location: Nicosia

Re: A question for GR!

Postby Get Real! » Mon Apr 23, 2018 9:42 am

Sotos, we can easily prove this by adding this simple test to our website...


if (navigator.userAgent === 'I am Get Real Bot') { alert('Sotos just paid me a visit!'); }


The alert message above will ALWAYS FIRE when you visit me using one of your browsers (which has set this UAS), but a bot will NEVER fire it even if it has the same UAS because it operates differently...


For example, a "header retrieving bot" can just...


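// A "bot" that only collects the response headers and never renders the page.
var Bot = new XMLHttpRequest(), S = '';
Bot.open('GET', 'http://mysite.com/', true);
// Register the handler before sending so that no state change is missed.
Bot.onreadystatechange = function () {
  if (this.readyState == this.HEADERS_RECEIVED) {
    S += Bot.getAllResponseHeaders();
  }
};
Bot.send();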



Our bot above grabs the response headers from sites and appends them to a string... a useless bot but it proves a point of how to fetch stuff without loading the site.

Did I load the website into a browser window? Nope!

Can the "trap" javascript at the top of my site catch it? Nope... the bot hasn't even loaded it!

So to summarize things... human users always request index.htm, whereas bots can just make specific HTTP requests to fetch things.

Of course, a bot can also retrieve the entire website script undetected without even loading it if it wanted to... which I can demonstrate.

That's it basically...
Get Real!
Forum Addict
Posts: 48333
Joined: Mon Feb 26, 2007 12:25 am
Location: Nicosia

Re: A question for GR!

Postby Sotos » Mon Apr 23, 2018 9:53 am

Get Real! wrote:
Sotos wrote:Awstats can do shit ;) It even tells you how it "knows" that the visit was made by a bot: It is either a known bot (e.g. Google AdSense) or it is an unknown user agent whose name contains things like "Bot", "Spider", "Crawler" etc.

:? A bot’s request header is always different to that of a browser.

Giauto oussou gamo to geraton mou… :lol:


You are clueless so shut up. I am working now so I will not bother with your crap, but in the evening I will send such requests to your website and you should see them in your stats the next day (because AWStats usually updates every few hours, sometimes only every 24 hours). And I wasn't planning to use a browser but Postman (https://www.getpostman.com/), and writing a simple Node.js app that makes requests is also a matter of minutes. AWStats is not sophisticated at all, and if anything its effort goes into excluding bots from your "Viewed traffic" stats, not the other way around.
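A minimal sketch of the kind of Node.js script described here, using only the built-in `http` module. The helper names `buildRequestOptions` and `visit` are hypothetical; the point is only that the User-Agent header is whatever the client chooses to send, browser or not.

```javascript
const http = require('http');

// Hypothetical helper: build the options for a plain HTTP GET that
// announces itself with an arbitrary User-Agent string.
function buildRequestOptions(targetUrl, userAgent) {
  const url = new URL(targetUrl);
  return {
    hostname: url.hostname,
    port: url.port || 80,
    path: url.pathname + url.search,
    method: 'GET',
    headers: { 'User-Agent': userAgent },
  };
}

// Hypothetical helper: send the request and resolve with the status code.
// The response body is discarded; only the entry left in the server's
// access log matters for this experiment.
function visit(targetUrl, userAgent) {
  return new Promise((resolve, reject) => {
    const req = http.request(buildRequestOptions(targetUrl, userAgent), (res) => {
      res.resume();
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end();
  });
}
```

Calling `visit('http://mysite.com/', 'I am Get Real Bot').then(console.log)` would send one such request. No page is rendered and no JavaScript on the target site is executed, yet the server still logs the request together with that user agent string.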
Sotos
Leading Contributor
Posts: 11357
Joined: Wed Aug 17, 2005 2:50 am

Re: A question for GR!

Postby Get Real! » Mon Apr 23, 2018 9:58 am

Sotos wrote:
Get Real! wrote:
Sotos wrote:Awstats can do shit ;) It even tells you how it "knows" that the visit was made by a bot: It is either a known bot (e.g. Google AdSense) or it is an unknown user agent whose name contains things like "Bot", "Spider", "Crawler" etc.

:? A bot’s request header is always different to that of a browser.

Giauto oussou gamo to geraton mou… :lol:

You are clueless so shut up. I am working now so I will not bother with your crap, but in the evening I will send such requests to your website and you should see them in your stats the next day (because AWStats usually updates every few hours, sometimes only every 24 hours). And I wasn't planning to use a browser but Postman (https://www.getpostman.com/), and writing a simple Node.js app that makes requests is also a matter of minutes. AWStats is not sophisticated at all, and if anything its effort goes into excluding bots from your "Viewed traffic" stats, not the other way around.

:? I don’t care about defending Awstats… with node.js you can easily set up a server to mimic whatever you want.

You don’t need “postman” or anyone else’s help Sotos…

I can demonstrate in node.js with 5 lines of code.
Get Real!
Forum Addict
Posts: 48333
Joined: Mon Feb 26, 2007 12:25 am
Location: Nicosia

Re: A question for GR!

Postby Sotos » Mon Apr 23, 2018 10:05 am

Get Real! wrote:Sotos, we can easily prove this by adding this simple test to our website...


if (navigator.userAgent === 'I am Get Real Bot') { alert('Sotos just paid me a visit!'); }


The alert message above will ALWAYS FIRE when you visit me using one of your browsers (which has set this UAS), but a bot will NEVER fire it even if it has the same UAS because it operates differently...


For example, a "header retrieving bot" can just...


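// A "bot" that only collects the response headers and never renders the page.
var Bot = new XMLHttpRequest(), S = '';
Bot.open('GET', 'http://mysite.com/', true);
// Register the handler before sending so that no state change is missed.
Bot.onreadystatechange = function () {
  if (this.readyState == this.HEADERS_RECEIVED) {
    S += Bot.getAllResponseHeaders();
  }
};
Bot.send();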



Our bot above grabs the response headers from sites and appends them to a string... a useless bot but it proves a point of how to fetch stuff without loading the site.

Did I load the website into a browser window? Nope!

Can the "trap" javascript at the top of my site catch it? Nope... the bot hasn't even loaded it!

So to summarize things... human users always request index.htm, whereas bots can just make specific HTTP requests to fetch things.

Of course, a bot can also retrieve the entire website script undetected without even loading it if it wanted to... which I can demonstrate.

That's it basically...


A bot does not run JavaScript; it just makes a request for a file. What runs the JavaScript is a JavaScript engine. AWStats doesn't use JavaScript (Google Analytics does), so for AWStats it doesn't matter whether the JS on your pages was executed or not; it just looks at the HTTP requests as recorded in the server logs. If it sees a user agent that it doesn't recognize and that has the word "bot", "spider" etc. in it (and not keywords that usually appear in browser user agents), then it will categorize it as "Unknown robot".
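The classification described above can be sketched as a plain function over what the server log holds, namely the user agent string and the requested path. This is an illustration, not AWStats' actual code: the function name `classifyVisitor`, the short robot list, and the order of the checks are all assumptions, and the extra check for browser keywords is left out for brevity.

```javascript
// Assumed, abbreviated list for illustration; AWStats ships its own,
// much longer one.
const KNOWN_ROBOTS = [
  { pattern: /googlebot/i, name: 'Googlebot' },
  { pattern: /mediapartners-google/i, name: 'Google AdSense' },
  { pattern: /yandex/i, name: 'Yandex bot' },
  { pattern: /mj12bot/i, name: 'MJ12bot' },
];

// Generic keywords; 'robot' is checked before 'bot' so that the more
// specific keyword is the one reported.
const GENERIC_KEYWORDS = ['robot', 'bot', 'spider', 'crawl'];

// Classify one logged request. Returns a label for anything that looks
// like a robot and null otherwise. Nothing here depends on whether the
// client executed any JavaScript.
function classifyVisitor(userAgent, path) {
  const ua = (userAgent || '').trim();
  if (ua === '') {
    return 'Unknown robot (identified by empty user agent string)';
  }
  const known = KNOWN_ROBOTS.find((robot) => robot.pattern.test(ua));
  if (known) {
    return known.name;
  }
  const lower = ua.toLowerCase();
  const keyword = GENERIC_KEYWORDS.find((word) => lower.includes(word));
  if (keyword) {
    return `Unknown robot (identified by '${keyword}')`;
  }
  if (path === '/robots.txt') {
    return "Unknown robot (identified by hit on 'robots.txt')";
  }
  return null;
}

console.log(classifyVisitor('I am Get Real Bot', '/'));
// prints: Unknown robot (identified by 'bot')
```

Under these assumptions, a request that carries the string "I am Get Real Bot" is counted as a robot no matter which program sent it, since the user agent text is the only evidence the classifier looks at.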
Sotos
Leading Contributor
Posts: 11357
Joined: Wed Aug 17, 2005 2:50 am
