Model
Performance of gpt-4.1-mini across tasks.
| Task | Attempt ID | Cost | Duration | Status / Error |
|---|---|---|---|---|
| coreutils-old-version-alpine | 0jd0ssv7t8ej8 | $1.626 | 10m17s | Failure exceeded max tool calls (200) |
| coreutils-old-version-alpine | 0uyfhajvzi84t | $0.570 | 5m40s | Failure task failed: sha1sum binary does not exist |
| coreutils-old-version-alpine | deuztodwd8fyw | $0.015 | 32s | Failure task failed: sha1sum binary does not exist |
| coreutils-old-version | 0czs7cd61cgiv | $0.428 | 4m24s | Failure exceeded max tool calls (90) |
| coreutils-old-version | pk3sw083z1mhf | $0.045 | 15m0s | Success |
| coreutils-old-version | r0va6gg2yz6cw | $0.289 | 4m6s | Failure exceeded max tool calls (90) |
| coreutils-static-alpine | 41v4f1oidpjf3 | $0.005 | 54s | Success |
| coreutils-static-alpine | fiieaupeprxhx | $0.005 | 56s | Success |
| coreutils-static-alpine | w2nie94mxcrcu | $0.005 | 55s | Success |
| coreutils-static | i1gcm32w6y8lt | $0.020 | 1m28s | Failure task failed: install missing at /home/peter/result/install or not executable |
| coreutils-static | m6jory3zzv0yy | $0.005 | 1m24s | Success |
| coreutils-static | ny4qbms66bcj2 | $0.005 | 1m11s | Success |
| coreutils | 2p70j91hufzfc | $0.003 | 55s | Success |
| coreutils | g0ufozl8ppx4u | $0.007 | 1m8s | Success |
| coreutils | hkho8smf3400h | $0.007 | 56s | Success |
| cowsay | 9awhy8x6j9ur4 | $0.003 | 12s | Failure task failed: Cowsay does not contain expected string (eyes) |
| cowsay | qsr9bokvjmfhz | $0.004 | 17s | Success |
| cowsay | scf6zd1ml423x | $0.003 | 10m9s | Failure context timeout: context deadline exceeded |
| curl-ssl-arm64-static | 012ghbq53dtrb | $0.052 | 1m53s | Failure task failed: curl binary does not exist |
| curl-ssl-arm64-static | btrneoynqi8wh | $0.100 | 2m3s | Failure task failed: curl binary does not exist |
| curl-ssl-arm64-static | nkse2jbrzo0ax | $0.031 | 1m13s | Failure task failed: curl binary does not exist |
| curl-ssl-arm64-static2 | 0p9s0zbpxhkl2 | $0.035 | 1m36s | Failure task failed: curl binary does not exist |
| curl-ssl-arm64-static2 | azgepyb5or5dc | $0.029 | 57s | Failure task failed: curl binary does not exist |
| curl-ssl-arm64-static2 | ff6i96z03vwjm | $0.244 | 3m50s | Failure task failed: curl binary does not exist |
| curl-ssl | 21pe0ozs8m7fm | $0.005 | 2m5s | Success |
| curl-ssl | ki2dypbz4tpv8 | $0.008 | 47s | Success |
| curl-ssl | p1htls0nvlfqc | $0.010 | 53s | Success |
| curl | 832y21jw8tb5e | $0.002 | 22s | Success |
| curl | aewcboi7hghxu | $0.034 | 1m17s | Success |
| curl | ofs50jzdw2i0b | $0.012 | 41s | Success |
| jq-static-musl | s8y2djf3d5u39 | $0.007 | 47s | Failure task failed: jq is not statically linked |
| jq-static-musl | sz34lbz4tfzla | $0.007 | 39s | Failure task failed: jq is not statically linked |
| jq-static-musl | trwl27b2sjave | $0.006 | 27s | Failure task failed: jq is not statically linked |
| jq-static | m1dtr4r8agieq | $0.015 | 1m30s | Failure task failed: jq is not statically linked |
| jq-static | rrnjfbyjuj24d | $0.007 | 1m5s | Failure task failed: jq is not statically linked |
| jq-static | v155i3jgsrqgx | $0.019 | 2m8s | Failure task failed: jq is not statically linked |
| jq-windows | da2b7m8uzzkrc | $0.007 | 36s | Success |
| jq-windows | fc9g3soaojv2j | $0.009 | 37s | Failure task failed: jq help does not contain expected string |
| jq-windows | p4h15s794y3xh | $0.005 | 45s | Failure task failed: jq help does not contain expected string |
| jq-windows2 | 2tzmylmf760ao | $0.022 | 1m57s | Failure task failed: jq help does not contain expected string |
| jq-windows2 | 80hvhaadxr0t2 | $0.032 | 1m59s | Success |
| jq-windows2 | fe91iqj9w2773 | $0.023 | 2m4s | Failure task failed: jq help does not contain expected string |
| jq | 81t1qjczdy22d | $0.002 | 36s | Success |
| jq | gnbm39zwmwk5n | $0.002 | 37s | Success |
| jq | r11pfzhpwmo4t | $0.008 | 29s | Success |
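
With three attempts per task, the per-task pass rate is easier to read as a tally than row by row. The following is a minimal sketch of one way to compute it from rows in the format above; it is not part of the benchmark tooling. The names `SAMPLE`, `parse_rows`, and `success_counts` are hypothetical, and the sketch assumes the last cell of each row begins with either `Success` or `Failure`. `SAMPLE` holds only an excerpt of the table (the cowsay and jq-windows rows).

```python
from collections import defaultdict

# Excerpt of the table above (cowsay and jq-windows rows only).
SAMPLE = """
| Task | Attempt ID | Cost | Duration | Status / Error |
|---|---|---|---|---|
| cowsay | 9awhy8x6j9ur4 | $0.003 | 12s | Failure task failed: Cowsay does not contain expected string (eyes) |
| cowsay | qsr9bokvjmfhz | $0.004 | 17s | Success |
| cowsay | scf6zd1ml423x | $0.003 | 10m9s | Failure context timeout: context deadline exceeded |
| jq-windows | da2b7m8uzzkrc | $0.007 | 36s | Success |
| jq-windows | fc9g3soaojv2j | $0.009 | 37s | Failure task failed: jq help does not contain expected string |
| jq-windows | p4h15s794y3xh | $0.005 | 45s | Failure task failed: jq help does not contain expected string |
"""


def parse_rows(markdown):
    """Yield (task, status_text) for each data row of a pipe-delimited table."""
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip anything that is not a 5-column data row,
        # including the header row and the |---| separator row.
        if len(cells) != 5 or cells[0] == "Task" or set(cells[0]) <= {"-"}:
            continue
        yield cells[0], cells[4]


def success_counts(markdown):
    """Return {task: (successes, attempts)} for the given table text."""
    counts = defaultdict(lambda: [0, 0])
    for task, status in parse_rows(markdown):
        counts[task][1] += 1
        # Assumption: a passing attempt's status cell starts with "Success".
        if status.startswith("Success"):
            counts[task][0] += 1
    return {task: tuple(pair) for task, pair in counts.items()}


if __name__ == "__main__":
    for task, (ok, total) in success_counts(SAMPLE).items():
        print(f"{task}: {ok}/{total} attempts succeeded")
```

Pasting the full table into `SAMPLE` gives the same tally for every task. The status check is a plain prefix match, so it would need adjusting if the status column used different wording.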