[nycbug-talk] Africa Network Traffic

Bjorn Nelson o_sleep at belovedarctos.com
Sat Oct 6 14:44:47 EDT 2007


On Oct 5, 2007, at 5:13 PM, Alex Pilosov wrote:

> On Fri, 5 Oct 2007, Isaac Levy wrote:
>>>> Where to start?  I'm drawing a blank on this...
>>> What is the internet?
>> Right :)
> To clarify, if you missed the point. You started with a very vague
> question. You need to define each word in the "how much of the
> internet's CONTENT originates in Africa, West Africa, or even Ghana",
> and then you'll realize how complex (or simple, depending on
> definition) your problem is.

I think Ike is just looking for correlations he can use to get an
understanding of network traffic in Africa; he probably has a good
understanding of how the internet works.  Although the internet has
various layers that will obscure this, the traffic is probably not
normally distributed.  I think the biggest trick to this project will
be making things discrete.

It's like working on grid computing, when you are trying to figure out
why certain jobs run slow or trying to estimate how long a job will
run.  You have the data from the hardware, but the work is in finding
which metric correlates with your job times.  That won't be 100%
either, but you need to find an imperfect match that you are
comfortable with, then put your faith in that metric and make
decisions based on it.

So Ike, or Ike's friend, is going to need to keep his question vague
and general to make sure he stays on track.  He will get imperfect
data and correlations, but he will need to find ones that he is
comfortable with, i.e. ones that can be explained to correlate within
a certain percentage of error, and where the rare situations in which
those metrics might be wrong are also explained.  If he can get a few
metrics that agree to some level, or whose disagreements can be
explained, then he will have good research.
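
To make the grid computing analogy above a bit more concrete, here is
a rough sketch of "find which metric correlates with your job times".
It ranks a few candidate metrics by the strength of their Pearson
correlation with the thing you care about.  The metric names and all
of the numbers are made up for illustration, and Pearson is just one
possible choice; it only catches linear relationships.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no information; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_metrics(metrics, target):
    """Return (name, r) pairs sorted by |r|, strongest correlation first."""
    scores = {name: pearson(values, target) for name, values in metrics.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical data: five jobs, their run times, and three candidate metrics.
job_times = [10.0, 20.0, 30.0, 40.0, 50.0]
metrics = {
    "cpu_load":    [1.0, 2.0, 3.0, 4.0, 5.0],
    "disk_io":     [5.0, 3.0, 4.0, 1.0, 2.0],
    "net_latency": [2.0, 2.0, 2.0, 2.0, 2.0],
}

ranking = rank_metrics(metrics, job_times)
for name, r in ranking:
    print(f"{name:12s} r = {r:+.2f}")
```

The same idea would apply to traffic data: treat each imperfect source
as a candidate metric, see which ones move together, and write down
where and why the ones you trust can be wrong.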

So how does he get web cache data?  Does Akamai show their stats,
albeit in reverse, because you want their customer (source) rather
than their customer's customer (us)?  Are they even public?  Are there
companies whose stats are public, and can that data be compared with
the percentage of the market in Africa they actually have?

The actual research data is best had from the science and technology
library at 34th/Madison.  Or, if you know someone at Baruch or CUNY,
they have access to research databases like Factiva or Lexis-Nexis and
a great library with business research on 24th/Lex.  I am sure some of
the highfalutin Columbia guys can toot their own horn as well.


