regex-redux F# .NET program
source code
// The Computer Language Benchmarks Game
// https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
//
// regex-dna program modified version of Valentin Kraevskiy
// contributed by Vassil Keremidchiev
// converted from regex-dna program
open System.Text.RegularExpressions
open System.Threading
let regex s = Regex (s, RegexOptions.Compiled)
let input = stdin.ReadToEnd ()
let withoutComments = (regex ">.*\n").Replace (input, "")
let text = (regex "\n").Replace (withoutComments, "")
let rec onblocks res s =
    let size = 1024*4096
    match s with
    | "" -> res
    | s when (s.Length < size) -> res @ [s]
    | s -> onblocks (res @ [s.Substring(0, size)]) (s.Substring(size))
["agggtaaa|tttaccct"
 "[cgt]gggtaaa|tttaccc[acg]"
 "a[act]ggtaaa|tttacc[agt]t"
 "ag[act]gtaaa|tttac[agt]ct"
 "agg[act]taaa|ttta[agt]cct"
 "aggg[acg]aaa|ttt[cgt]ccct"
 "agggt[cgt]aa|tt[acg]accct"
 "agggta[cgt]a|t[acg]taccct"
 "agggtaa[cgt]|[acg]ttaccct"]
|> List.map (fun s -> async {
            return System.String.Format( "{0} {1}", s,
                    ((regex s).Matches text).Count) } )
|> Async.Parallel |> Async.RunSynchronously
|> Array.iter (printfn "%s")
let newTextLength t =
    ["tHa[Nt]", "<4>"
     "aND|caN|Ha[DS]|WaS", "<3>"
     "a[NSt]|BY", "<2>"
     "<[^>]*>", "|"
     "\\|[^|][^|]*\\|" , "-"]
    |> List.fold (fun s (code, alt) -> (regex code).Replace (s, alt)) t
    |> String.length
let newText =
    text |> onblocks []
    |> Seq.map (fun s -> async { return newTextLength s } )
    |> Async.Parallel |> Async.RunSynchronously
    |> Array.sum
printf "\n%i\n%i\n%i\n" input.Length text.Length newText
notes, command-line, and program output
NOTES:
64-bit Ubuntu quad core
.NET SDK 9.0.100
Host Version: 9.0.0
Commit: 9d5a6a9aa4
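<OutputType>Exe</OutputType>
<TargetFramework>net9.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<ServerGarbageCollection>true</ServerGarbageCollection>
<ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
<PublishAot>false</PublishAot>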
Fri, 15 Nov 2024 02:21:30 GMT
MAKE:
cp regexredux.fsharpcore Program.fs
cp Include/fsharpcore/program.fsproj .
mkdir obj
cp Include/fsharpcore/project.assets.json ./obj
/opt/src/dotnet-sdk-9.0.100/dotnet build -c Release --use-current-runtime
Determining projects to restore...
/home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/program.fsproj : warning NU1900: Error occurred while getting package vulnerability data: Unable to load the service index for source https://api.nuget.org/v3/index.json.
Restored /home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/program.fsproj (in 6.2 sec).
/home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/program.fsproj : warning NU1900: Error occurred while getting package vulnerability data: Unable to load the service index for source https://api.nuget.org/v3/index.json.
program -> /home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/bin/Release/net9.0/linux-x64/program.dll
Build succeeded.
/home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/program.fsproj : warning NU1900: Error occurred while getting package vulnerability data: Unable to load the service index for source https://api.nuget.org/v3/index.json.
/home/dunham/all-benchmarksgame/benchmarksgame_i53330/regexredux/tmp/program.fsproj : warning NU1900: Error occurred while getting package vulnerability data: Unable to load the service index for source https://api.nuget.org/v3/index.json.
2 Warning(s)
0 Error(s)
Time Elapsed 00:00:15.03
17.23s to complete and log all make actions
COMMAND LINE:
./bin/Release/net9.0/linux-x64/program 0 < regexredux-input500000.txt
UNEXPECTED OUTPUT
13c13
< 2739399
---
> 2739360
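POSSIBLE CAUSE (an assumption, not confirmed by this log):
The program's final count is 39 larger than the expected 2739360. The source applies the replacement sequence to each block from onblocks independently and sums the lengths. With text.Length = 5000000 and a block size of 1024*4096 = 4194304, the text is cut into two blocks, so a match that straddles the cut, or a "|" pairing that shifts after it, may be handled differently than in the unsplit text. The sketch below only shows that chunked and whole-text replacement can disagree on a tiny made-up input; the names replacedLength, blocks and sample are hypothetical and not part of the benchmarked program.

```fsharp
open System.Text.RegularExpressions

// The replacement sequence used by newTextLength in the program above.
let replacements =
    [ "tHa[Nt]", "<4>"
      "aND|caN|Ha[DS]|WaS", "<3>"
      "a[NSt]|BY", "<2>"
      "<[^>]*>", "|"
      "\\|[^|][^|]*\\|", "-" ]

// Apply every replacement in order and return the resulting length.
let replacedLength (t: string) =
    replacements
    |> List.fold (fun (s: string) (pattern, alt) -> Regex.Replace (s, pattern, alt)) t
    |> String.length

// Hypothetical helper: fixed-size blocks; the last block may be shorter.
let blocks size (s: string) =
    [ for i in 0 .. size .. s.Length - 1 -> s.Substring(i, min size (s.Length - i)) ]

// Illustrative input, not benchmark data.
let sample = "tHat"

// Whole text: "tHat" -> "<4>" -> "|", length 1.
let whole = replacedLength sample

// Blocks of 2: "tH" is unchanged (length 2); "at" -> "<2>" -> "|" (length 1).
let chunked = sample |> blocks 2 |> List.sumBy replacedLength

printfn "whole=%d chunked=%d" whole chunked
```

If this is the cause here, running the replacements over the unsplit text (or choosing block boundaries that cannot fall inside a match) would be one way to test it.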
PROGRAM OUTPUT:
agggtaaa|tttaccct 36
[cgt]gggtaaa|tttaccc[acg] 125
a[act]ggtaaa|tttacc[agt]t 426
ag[act]gtaaa|tttac[agt]ct 290
agg[act]taaa|ttta[agt]cct 536
aggg[acg]aaa|ttt[cgt]ccct 153
agggt[cgt]aa|tt[acg]accct 143
agggta[cgt]a|t[acg]taccct 160
agggtaa[cgt]|[acg]ttaccct 219
5083411
5000000
2739399
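READING THE LAST THREE LINES:
Per the final printf in the source, these are input.Length, text.Length (after header lines and newlines are removed), and the summed length after the replacement sequence. A minimal sketch of the first two steps on a made-up FASTA fragment follows; sample, withoutHeaders and sequenceOnly are illustrative names, and the fragment is not benchmark data.

```fsharp
open System.Text.RegularExpressions

// Made-up FASTA fragment for illustration only.
let sample = ">ONE Homo sapiens alu\nGGCC\nGGGC\n>TWO IUB ambiguity codes\ncttB\n"

// Same two steps as the program: drop ">" header lines, then drop newlines.
// "." does not match "\n" by default, so ">.*\n" removes exactly one header line.
let withoutHeaders = Regex.Replace (sample, ">.*\n", "")
let sequenceOnly = Regex.Replace (withoutHeaders, "\n", "")

printfn "%d %d %s" sample.Length sequenceOnly.Length sequenceOnly
```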