Results tagged “analysis” from manAmplified

Cleaning Squid Logs

Two quick Unix commands that will clean a Squid log file for import into R. Very useful when you need to fine-tune your cache strategy.

Quick note on using box plots to show arrival rates during specified interval periods.

Cache Simulation with R

Below are a few R functions I use to evaluate and tune our Squid caches, which we run in accelerator mode. The functions take a real stream of requests from either an application log or a Squid log. From there, the arrival rate can be determined for every second, or a filter can be applied and the cache hit ratio or arrival rate re-determined. Better yet, the series can be filtered a number of times with different filter values to determine the best value for a given filter.
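
The post's R functions are not reproduced in this excerpt, so here is a rough Python sketch of the same idea, not the original code. It assumes the request stream is a list of `(timestamp_seconds, key)` pairs, and it models the "filter" as a hypothetical fixed-TTL cache. Both of those are assumptions made for illustration, as are all the function names.

```python
from collections import Counter


def arrival_rate(requests):
    """Count requests per whole second; requests are (timestamp, key) pairs."""
    return Counter(int(ts) for ts, _ in requests)


def ttl_filter(requests, ttl):
    """Return the requests that miss a hypothetical TTL cache and reach the origin."""
    fetched_at = {}  # key -> time the cached copy was last fetched
    misses = []
    for ts, key in sorted(requests):
        last = fetched_at.get(key)
        if last is None or ts - last >= ttl:
            fetched_at[key] = ts  # miss: fetch from the origin and cache it
            misses.append((ts, key))
    return misses


def hit_ratio(requests, ttl):
    """Fraction of requests served from the cache for a given TTL."""
    if not requests:
        return 0.0
    return 1 - len(ttl_filter(requests, ttl)) / len(requests)


# Made-up sample stream, only to exercise the functions.
requests = [(0.1, "/a"), (0.5, "/a"), (1.2, "/b"),
            (2.0, "/a"), (2.4, "/b"), (5.0, "/a")]

# Filter the same series several times with different TTL values.
sweep = {ttl: hit_ratio(requests, ttl) for ttl in (1, 2, 5)}

# Arrival rate seen by the origin once the TTL=2 filter is applied.
origin_rate = arrival_rate(ttl_filter(requests, 2))
```

Comparing the entries in `sweep` is the "filter a number of times" step: each value trades a higher hit ratio against staler content, and `origin_rate` shows what load is left for the backend.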

Bayesian Inferencing

If you are reading this, and decide to learn more about this, and love this, you probably want to play with this.

Temporal Locality

If you work with distributed systems, HTTP referenceable services, and specifically caching systems, you probably should check out On the Intrinsic Locality Properties of Web Reference Streams.

Engineering Statistics

The Engineering Statistics Handbook, provided by NIST, is available in HTML and PDF by chapters.

Statistics Resources

Some online courses for statistical data analysis with references to R.

Applying Little's Law

When developers are asked to load test a system, most will start up a million threads to load the server and show that it eventually becomes unresponsive. But that gives them no numbers from which to build a load profile, so they cannot properly plan for peak times in a production environment.
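
Little's Law says that the average number of requests in a system equals the arrival rate times the average time each request spends in the system (L = λW). As a minimal sketch of how that turns into a load profile, the Python below sizes a load test from a target arrival rate. The function names and the sample numbers are made up for illustration, and the thread calculation assumes a closed-loop load generator where each thread waits for its response (plus optional think time) before sending again.

```python
import math


def concurrency(arrival_rate_per_s, mean_time_in_system_s):
    """Little's Law: L = lambda * W (average requests in the system)."""
    return arrival_rate_per_s * mean_time_in_system_s


def required_threads(target_rate_per_s, response_time_s, think_time_s=0.0):
    """Threads needed for a closed-loop generator to sustain the target rate.

    Each thread completes one request every (response + think) seconds,
    so N threads produce N / (response + think) requests per second.
    """
    return math.ceil(target_rate_per_s * (response_time_s + think_time_s))


# Hypothetical peak: 200 requests/s at a mean response time of 0.25 s.
in_flight = concurrency(200, 0.25)
threads_no_think = required_threads(200, 0.25)
threads_with_think = required_threads(200, 0.25, think_time_s=0.75)
```

With numbers like these in hand, the test can aim at a measured peak arrival rate with a thread count that matches it, not just an arbitrarily large one.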