doc: add instructions for reproducing benchmarks

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Alexander Bezzubov 2018-12-26 22:09:27 +01:00
parent ec96325d13
commit db21cd6557
GPG Key ID: 8039F5787EFCD05D

@ -217,13 +217,27 @@ Golang's regexp engine being slower than Ruby's, which uses the [oniguruma](http
You can find scripts and additional information (such as the software and hardware used
and the benchmark results per sample file) in the [*benchmarks*](https://github.com/src-d/enry/blob/master/benchmarks) directory.
If you want to reproduce the same benchmarks you can run:
benchmarks/run.sh
### Benchmark Dependencies
As the benchmarks depend on Ruby and the GitHub Linguist gem, make sure you have:
- Ruby (e.g. using [`rbenv`](https://github.com/rbenv/rbenv)) and [`bundler`](https://bundler.io/) installed
- Docker
- [native dependencies](https://github.com/github/linguist/#dependencies) installed
- Build the gem: `cd .linguist && bundle install && rake build_gem && cd -`
- Install it: `gem install --no-rdoc --no-ri --local .linguist/github-linguist-*.gem`
from the root's project directory and it'll run benchmarks for enry and linguist, parse the output, create csv files and create a histogram (you must have installed [gnuplot](http://gnuplot.info) in your system to get the histogram).
This can take some time, so to run local benchmarks for a quick check you can either:
### How to reproduce current results
If you want to reproduce the same benchmarks as reported above:
- Make sure all [dependencies](#benchmark-dependencies) are installed
- Install [gnuplot](http://gnuplot.info) (in order to plot the histogram)
- Run `$ benchmarks/run.sh`
It will run the benchmarks for enry and linguist, parse the output, create CSV files and plot the histogram. This takes some time.
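
Since the full run takes a while, it can help to confirm that the required tools are on your `PATH` before starting. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and the list of commands to check is an assumption derived from the dependencies listed above.

```shell
#!/bin/sh
# Sketch: print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper, not something shipped with enry.
missing_tools() {
  for tool in "$@"; do
    # command -v exits non-zero when the command cannot be found
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

For example, `missing_tools ruby bundle gem rake docker gnuplot` prints one line per tool that is missing and nothing if all are present, so you can run `benchmarks/run.sh` only when the output is empty. It does not check the native dependencies or whether the Linguist gem has been built and installed.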
### Quick
To run quicker benchmarks you can either:
make benchmarks
@ -231,7 +245,7 @@ to get average times for the main detection function and strategies for the whol
make benchmarks-samples
if you want to see measures by sample file.
if you want to see measures per sample file.
Why Enry?