# testbed
testbed is a framework for evaluating the quality of the completions generated by llm-ls and the underlying model.
It works by first making holes in files, then generating completions for a given list of repositories, and finally running the associated unit tests.
The result is a table with one line per repository, plus a totals line, showing the average percentage of passing unit tests.
Here is a simplified pseudo-code outline of the algorithm:
```
read the repositories file
read the holes file(s)
for each repository:
    spawn a thread:
        setup the repository
        for each hole:
            make the hole as specified by the file
            generate completions
            build the code
            run the tests
print results
```
## Running testbed
Before running testbed, you will need to create a repositories file: a YAML file containing the list of repositories to test, along with the parameters for the `llm-ls/getCompletions` request.
Repositories can be sourced either from your local storage or from GitHub.
You can check the repositories files at the root of the crate to see the full structure.
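As a rough sketch, an entry might look like the following. Only `setup_commands`, `build_command`, `build_args`, and `runner` are documented in this README; the other field names here are illustrative assumptions, so refer to the actual repositories files for the exact schema:
```yaml
# Minimal sketch of one repository entry; field names other than
# setup_commands, build_command, build_args and runner are assumptions.
repositories:
  - name: huggingface_hub        # hypothetical field
    source:                      # hypothetical field: local path or GitHub reference
      type: github
      owner: huggingface
      name: huggingface_hub
    holes_file: holes/huggingface_hub.json   # hypothetical field
    setup_commands:
      - ["python3", ["-m", "venv", "huggingface_hub-venv"]]
    build_command: huggingface_hub-venv/bin/python3
    build_args: ["-m", "compileall", "-q", "."]
    runner: pytest
```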
### Generating holes
Before running testbed, you will also need to generate a holes file for each repository. To do so, run testbed with the `-g` option; you can specify the number of holes to make with `-n <number>`. testbed will take the list of repositories in your YAML file and create the associated holes files at the paths defined there.
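For instance, assuming you run the crate through Cargo (the `-g` and `-n` flags are documented above; the rest of the invocation is an assumption):
```
# hypothetical invocation: -g generates holes files, -n sets how many
# holes to make per repository
cargo run --bin testbed -- -g -n 10
```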
### Setup
testbed runs completions for each repository in parallel. It first creates a temporary directory, then copies or downloads the repository's source files to that location, and finally runs the setup commands.
Setup commands are useful for installing dependencies:
```yaml
setup_commands:
  - ["python3", ["-m", "venv", "huggingface_hub-venv"]]
  - ["huggingface_hub-venv/bin/python3", ["-m", "pip", "install", ".[dev]"]]
```
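Each entry is a program name followed by its list of arguments.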
### Build
Before running the tests, testbed runs a build command to check that the code is valid.
To configure the build step, set the following options:
```yaml
build_command: huggingface_hub-venv/bin/python3
build_args: ["-m", "compileall", "-q", "."]
```
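For a repository using the cargo runner described below, the build step might look like the following; this is an illustrative sketch, not a configuration taken from the crate's files:
```yaml
# illustrative sketch for a Rust repository; not taken from the
# crate's actual repositories files
build_command: cargo
build_args: ["build"]
```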
### Runners
testbed supports two test runners at the moment:
- cargo
- pytest
To configure your runner, you have the following options:
```yaml
runner: pytest
runner_command: huggingface_hub-venv/bin/python3
runner_extra_args:
  - "-k"
  - "_utils_ and not _utils_cache and not _utils_http and not paginate and not git"
```
You can override the runner's command with `runner_command`, which is useful when dependencies are set up in a venv.
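As a sketch, a cargo runner configuration could be as simple as the following; `runner` and `runner_extra_args` are documented above, while the extra arguments themselves are a hypothetical example:
```yaml
# hypothetical example; the extra args are assumed to be passed
# through to the test runner invocation
runner: cargo
runner_extra_args: ["--", "--test-threads", "1"]
```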
## References
testbed was inspired by [human-eval](https://github.com/openai/human-eval) and [RepoEval](https://arxiv.org/abs/2303.12570).