trdsql is a CLI tool that can execute SQL queries on CSV, LTSV, JSON, YAML and TBLN files.
This tool is similar to others such as q and textql, with a key distinction: it allows the use of PostgreSQL or MySQL syntax.
For usage as a library, please refer to the godoc and the provided examples.
-
- 1.1. go get
- 1.1.1. Requirements
- 1.2. Download binary
- 1.3. Homebrew
- 1.4. MacPorts
- 1.5. FreeBSD
- 1.6. Cgo free
- 1.1. go get
-
- 2.1. Docker pull
- 2.2. image build
- 2.3. Docker Run
-
- 3.1. Global options
- 3.2. Input formats
- 3.2.1. Input options
- 3.3. Output formats
- 3.3.1. Output options
- 3.4. Handling of NULL
- 3.5. Multiple queries
-
- 4.1. STDIN input
- 4.2. Multiple files
- 4.3. Compressed files
- 4.4. Output file
- 4.5. Output compression
- 4.6. Guess by output file name
- 4.7. Columns is not constant
- 4.8. TSV (Tab Separated Value)
- 4.9. LTSV (Labeled Tab-separated Values)
- 4.10. JSON
- 4.10.1. jq expression
- 4.11. JSONL(NDJSON)
- 4.12. YAML
- 4.13. TBLN
- 4.14. WIDTH
- 4.15. TEXT
- 4.16. Raw output
- 4.17. ASCII Table & MarkDown output
- 4.18. Vertical format output
-
- 5.1. SQL function
- 5.2. JOIN
- 5.3. PostgreSQL
- 5.3.1. Function
- 5.3.2. Join table and CSV file is possible
- 5.4. MySQL
- 5.5. Analyze
- 5.6. Configuration
1. INSTALL
1.1. go get
go get -d github.com/noborus/trdsql cd $GOPATH/src/github.com/noborus/trdsql make make install
1.1.1. Requirements
go 1.21 or higher.
1.2. Download binary
Download binary from the releases page(Linux/Windows/macOS).
1.3. Homebrew
brew install noborus/tap/trdsql1.4. MacPorts
sudo port selfupdate sudo port install trdsql
1.5. FreeBSD
1.6. Cgo free
Typically, go-sqlite3 is used for building.
However, if you're building with CGO_ENABLED=0, consider using sqlite instead.
Building without CGO (CGO Free) can reduce issues related to cross-compiling, but it may result in slower execution times.
2. Docker
2.1. Docker pull
Pull the latest image from the Docker hub.
docker pull noborus/trdsql2.2. image build
Or build it yourself.
2.3. Docker Run
Docker run.
docker run --rm -it -v $(pwd):/tmp trdsql [options and commands]3. Usage
To use trdsql, you can either specify an SQL query or simply provide a file for conversion.
For file conversion, this is equivalent to executing 'trdsql -o[output format] "SELECT * FROM filename"'.
trdsql -o[output format] -t [input filename]3.1. Global options
-afilename analyze the file and suggest SQL.-Afilename analyze the file but only suggest SQL.-configfilename configuration file location.-dbdb name specify db name of the setting.-dblistdisplay db list of configure.-driverstring database driver. [ mysql | postgres | sqlite3 | sqlite(CGO Free) ] (default "sqlite3")-dsnstring database driver specific data source name.-debugdebug print.-helpdisplay usage information.-qfilename read query from the specified file.-tfilename read table name from the specified file.-versiondisplay version information.
3.2. Input formats
-igguess format from extension. (default)-icsvCSV format for input.-ijsonJSON format for input.-iltsvLTSV format for input.-iyamlYAML format for input.-itblnTBLN format for input.-iwidthwidth specification format for input.-itexttext format for input.
3.2.1. Input options
-ihthe first line is interpreted as column names(CSV only).-idcharacter field delimiter for input(default ",")(CSV only).-ijqstring jq expression string for input(JSON/JSONL only).-ilrint limited number of rows to read.-inullstring value(string) to convert to null on input.-inumadd row number column.-irint number of rows to preread. (default 1)-isint skip header row.
3.3. Output formats
-ocsvCSV format for output. (default)-ojsonJSON format for output.-ojsonlJSONL(JSON Lines) format for output.-oltsvLTSV format for output.-oatASCII Table format for output.-omdMarkdown format for output.-orawRaw format for output.-ovfVertical format for output.-oyamlYAML format for output.-otblnTBLN format for output.
Or, guess the output format by file name.
3.3.1. Output options
-outfilename output file name.-out-without-guessoutput without guessing (when using -out).-ohoutput column name as header.-odcharacter field delimiter for output. (default ",")(CSV and RAW only).-oqcharacter quote character for output. (default """)(CSV only).-oaqenclose all fields in quotes for output(CSV only).-ocrlfuse CRLF for output. End each output line with '\r\n' instead of '\n'."(CSV only).-onowrapdo not wrap long columns(AT and MD only).-onullvalue(string) to convert from null on output.-ozstring compression format for output. [ gzip | bz2 | zstd | lz4 | xz ]
3.4. Handling of NULL
NULL is undecided in many text formats.
JSON null is considered the same as SQL NULL.
For formats other than JSON, you must specify a string that is considered NULL.
In most cases you will need to specify an empty string ("").
If -inull "" is specified, an empty string will be treated as SQL NULL.
SQL NULL is an empty string by default. Specify the -onull "(NULL)" option if you want a different string.
$ echo "1,,v" | trdsql -inull "" -onull "(NULL)" "SELECT * FROM -" 1,(NULL),v
In the case of JSON, null is NULL as it is, and the specified string is converted to NULL.
$ echo '[1,null,""]' | trdsql -inull "" -ojson -ijson "SELECT * FROM -" [ { "c1": "1" }, { "c1": null }, { "c1": null } ]
3.5. Multiple queries
Multiple queries can be executed by separating them with a semicolon. Update queries must be followed by a SELECT statement.
$ trdsql "UPDATE SET c2='banana' WHERE c3='1';SELECT * FROM test.csv" 1,Orange 2,Melon 3,banana
You can perform multiple SELECTs, but the output will be in one format.
$ trdsql -oh "SELECT c1,c2 FROM test.csv;SELECT c2,c1 FROM test.csv" c1,c2 1,Orange 2,Melon 3,Apple c2,c1 Orange,1 Melon,2 Apple,3
4. Example
test.csv file.
Please write a file name like a table name.
trdsql "SELECT * FROM test.csv"-q filename can execute SQL from file
4.1. STDIN input
"-" or "stdin" is received from standard input instead of file name.
cat test.csv | trdsql "SELECT * FROM -"or
cat test.csv | trdsql "SELECT * FROM stdin"4.2. Multiple files
Multiple matched files can be executed as one table.
$ trdsql -ih "SELECT * FROM tt*.csv" 1,test1 2,test2 3,test3
Note
It is not possible to mix different formats (ex: CSV and LTSV).
4.3. Compressed files
If the file is compressed with gzip, bz2, zstd, lz4, xz, it will be automatically uncompressed.
trdsql "SELECT * FROM testdata/test.csv.gz"trdsql "SELECT * FROM testdata/test.csv.zst"It is possible to mix uncompressed and compressed files using wildcards.
trdsql "SELECT * FROM testdata/test.csv*"4.4. Output file
-out filename option to output the file to a file.
trdsql -out result.csv "SELECT * FROM testdata/test.csv ORDER BY c1"4.5. Output compression
-oz compression type to compress and output.
trdsql -oz gz "SELECT * FROM testdata/test.csv ORDER BY c1" > result.csv.gz4.6. Guess by output file name
The filename of -out filename option determines
the output format(csv, ltsv, json, tbln, raw, md, at, vf, jsonl)
and compression format(gzip, bz2, zstd,lz4, xz) by guess.
Guess by extension output format + output compression (eg .csv.gz, .ltsv.lz4, .md.zst ...).
The following is an LTSV file compressed in zstd.
trdsql -out result.ltsv.zst "SELECT * FROM testdata/test.csv"4.7. Columns is not constant
If the number of columns is not a constant, read and decide multiple rows.
$ trdsql -ir 3 -iltsv "SELECT * FROM test_indefinite.ltsv" 1,Orange,50,, 2,Melon,500,ibaraki, 3,Apple,100,aomori,red
4.8. TSV (Tab Separated Value)
-id "\\t" is input from TSV (Tab Separated Value)
trdsql -id "\t" "SELECT * FROM test-tab.csv"-od "\\t" is TSV (Tab Separated Value) output.
$ trdsql -od "\t" "SELECT * FROM test.csv" 1 Orange 2 Melon 3 Apple
4.9. LTSV (Labeled Tab-separated Values)
-iltsv is input from LTSV(Labeled Tab-separated Values).
sample.ltsv
id:1 name:Orange price:50
id:2 name:Melon price:500
id:3 name:Apple price:100
trdsql -iltsv "SELECT * FROM sample.ltsv"1,Orange,50
2,Melon,500
3,Apple,100
Note
Only the columns in the first row are targeted.
-oltsv is LTSV(Labeled Tab-separated Values) output.
$ trdsql -iltsv -oltsv "SELECT * FROM sample.ltsv" id:1 name:Orange price:50 id:2 name:Melon price:500 id:3 name:Apple price:100
4.10. JSON
-ijson is input from JSON.
sample.json
[
{
"id": "1",
"name": "Orange",
"price": "50"
},
{
"id": "2",
"name": "Melon",
"price": "500"
},
{
"id": "3",
"name": "Apple",
"price": "100"
}
]$ trdsql -ijson "SELECT * FROM sample.json" 1,Orange,50 2,Melon,500 3,Apple,100
JSON can contain structured types, but trdsql is stored as it is as JSON string.
sample2.json
[
{
"id": 1,
"name": "Drolet",
"attribute": { "country": "Maldives", "color": "burlywood" }
},
{
"id": 2,
"name": "Shelly",
"attribute": { "country": "Yemen", "color": "plum" }
},
{
"id": 3,
"name": "Tuck",
"attribute": { "country": "Mayotte", "color": "antiquewhite" }
}
]$ trdsql -ijson "SELECT * FROM sample2.json" 1,Drolet,"{""color"":""burlywood"",""country"":""Maldives""}" 2,Shelly,"{""color"":""plum"",""country"":""Yemen""}" 3,Tuck,"{""color"":""antiquewhite"",""country"":""Mayotte""}"
Please use SQL function.
- SQLite3 - JSON Functions And Operators
- PostgreSQL - JSON Functions and Operators
- MySQL - Functions That Search JSON Values
$ trdsql -ijson "SELECT id, name, JSON_EXTRACT(attribute,'$.country'), JSON_EXTRACT(attribute,'$.color') FROM sample2.json" 1,Drolet,Maldives,burlywood 2,Shelly,Yemen,plum 3,Tuck,Mayotte,antiquewhite
4.10.1. jq expression
If json has a hierarchy, you can filter by jq expression.
The jq expression is implemented using gojq.
menu.json
{
"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{
"value": "New",
"onclick": "CreateDoc()"
},
{
"value": "Open",
"onclick": "OpenDoc()"
},
{
"value": "Save",
"onclick": "SaveDoc()"
}
]
}
}
}You can write a jq expression by connecting :: after the json file name. Enclose the jq expression in double quotes if needed.
trdsql -oat 'SELECT value, onclick FROM menu.json::".menu.popup.menuitem"'Or specify with the -ijq option.
$ trdsql -oat -ijq ".menu.popup.menuitem" "SELECT * FROM menu.json" +-------+-------------+ | value | onclick | +-------+-------------+ | New | CreateDoc() | | Open | OpenDoc() | | Save | SaveDoc() | +-------+-------------+
Example to use instead of gojq.
$ echo '{"foo": 128}' | trdsql -ijson "SELECT * FROM -::'.foo'" 128 $ echo '{"a": {"b": 42}}' | trdsql -ijson "SELECT * FROM -::'.a.b'" 42 $ echo '{"id": "sample", "10": {"b": 42}}' | trdsql -ijson "SELECT * FROM -::'{(.id): .[\"10\"].b}'" 42 $ echo '[{"id":1},{"id":2},{"id":3}]' | trdsql -ijson "SELECT * FROM -::'.[] | .id'" 1 2 3 $ echo '{"a":1,"b":2}' | trdsql -ijson "SELECT * FROM -::'.a += 1 | .b *= 2'" 4,2 $ echo '{"a":1} [2] 3' | trdsql -ijson "SELECT * FROM -::'. as {\$a} ?// [\$a] ?// \$a | \$a'" 1 2 3
4.11. JSONL(NDJSON)
Another json format. JSONL(JSON Lines). It is also called ndjson.
sample2.json
{"id": "1","name": "Orange","price": "50"}
{"id": "2","name": "Melon","price": "500"}
{"id": "3","name": "Apple","price": "100"}-ojson is JSON Output.
$ trdsql -ojson "SELECT * FROM test.csv" [ { "c1": "1", "c2": "Orange" }, { "c1": "2", "c2": "Melon" }, { "c1": "3", "c2": "Apple" } ]
To output in JSONL, specify -ojsonl.
$ trdsql -ojsonl "SELECT * FROM test.csv" {"c1":"1","c2":"Orange"} {"c1":"2","c2":"Melon"} {"c1":"3","c2":"Apple"}
4.12. YAML
-iyaml is input from YAML
(Or if the extension is yaml or yml, it is considered a YAML file).
sample.yaml
- id: 1 name: Orange price: 50 - id: 2 name: Melon price: 500 - id: 3 name: Apple price: 100
$ trdsql -iyaml -ocsv "SELECT * FROM sample.yaml" 1,Orange,50 2,Melon,500 3,Apple,100
Since yaml is internally converted to JSON, it can be converted to json and output.
sample2.yaml
a: true b: c: 2 d: [3, 4, 5] e: - name: fred value: 3 - name: sam value: 4%
$ trdsql -ojson "SELECT * FROM sample2.yaml" [ { "a": "true", "b": { "c": 2, "d": [ 3, 4, 5 ], "e": [ { "name": "fred", "value": 3 }, { "name": "sam", "value": "4%" } ] } } ]
So in addition you can also use jq syntax.
$ trdsql -ojson "SELECT * FROM sample2.yaml::.b.e" [ { "name": "fred", "value": "3" }, { "name": "sam", "value": "4%" } ]
json can be converted to yaml.
$ trdsql -ojson "SELECT * FROM sample2.yaml::.b.e" - name: fred value: 3 - name: sam value: 4%
4.13. TBLN
-itbln is input from TBLN.
sample.tbln
; name: | id | name |
; type: | int | text |
| 1 | Bob |
| 2 | Alice |
$ trdsql -itbln "SELECT * FROM sample.tbln" 1,Bob 2,Alice
TBLN file reflects extras name and type.
-otbln is TBLN Output.
$ trdsql -otbln "SELECT c1::int as id, c2::text as name FROM test.csv" ; created_at: 2019-03-22T13:20:31+09:00 ; name: | id | name | ; type: | int | text | | 1 | Orange | | 2 | Melon | | 3 | Apple |
TBLN can contain column names and type definitions. Please refer to https://tbln.dev/ for details of TBLN.
4.14. WIDTH
-iwidth inputs the format specifying the width.
This is used when the header column width represents the body column width.
$ ps | trdsql -oh -iwidth "SELECT * FROM -" PID,TTY,TIME,CMD 302965,pts/3,00:00:12,zsh 733211,pts/3,00:00:00,ps 733212,pts/3,00:00:00,tee 733213,pts/3,00:00:00,guesswidth
-id " " for CSV also works in many cases.
But -id " " does not recognize spaces in columns very well.
-iwidth recognizes column widths and space separators.
4.15. TEXT
The -itext option or files with “.text”extension are in text format.
This is a one line to one column format.
A blank line is also a line, unlike the CSV format.
$ cat test.text a b c $ trdsql -itext "SELECT * FROM test.text" a b c
It is useful in conjunction with the -inum option.
$ trdsql -inum "SELECT * FROM test.text" 1,a 2, 3,b 4, 5,c
4.16. Raw output
-oraw is Raw Output.
It is used when "escape processing is unnecessary" in CSV output.
(For example, when outputting JSON in the database).
$ trdsql -oraw "SELECT row_to_json(t,TRUE) FROM test.csv AS t" {"c1":"1", "c2":"Orange"} {"c1":"2", "c2":"Melon"} {"c1":"3", "c2":"Apple"}
Multiple delimiter characters can be used for raw.
$ trdsql -oraw -od "\t|\t" -db pdb "SELECT * FROM test.csv" 1 | Orange 2 | Melon 3 | Apple
4.17. ASCII Table & MarkDown output
-oat is ASCII table output.
$ trdsql -oat "SELECT * FROM test.csv" +----+--------+ | C1 | C2 | +----+--------+ | 1 | Orange | | 2 | Melon | | 3 | Apple | +----+--------+
-omd is Markdown output.
$ trdsql -omd "SELECT * FROM test.csv" | C1 | C2 | |----|--------| | 1 | Orange | | 2 | Melon | | 3 | Apple |
The -onowrap option does not wrap long columns in at or md output.
4.18. Vertical format output
-ovf is Vertical format output("column name | value" vertically).
$ trdsql -ovf "SELECT * FROM test.csv" ---[ 1]-------------------------------------------------------- c1 | 1 c2 | Orange ---[ 2]-------------------------------------------------------- c1 | 2 c2 | Melon ---[ 3]-------------------------------------------------------- c1 | 3 c2 | Apple
5. SQL
5.1. SQL function
$ trdsql "SELECT count(*) FROM test.csv" 3
The default column names are c1, c2,...
$ trdsql "SELECT c2,c1 FROM test.csv" Orange,1 Melon,2 Apple,3
Note
the available functions and their syntax depend on the driver you have chosen (mysql or postgres or sqlite). The default one is sqlite.
5.2. JOIN
The SQL JOIN can be used.
user.csv
hist.csv
1,2017-7-10 2,2017-7-10 2,2017-7-11
$ trdsql "SELECT u.c1,u.c2,h.c2 FROM user.csv as u LEFT JOIN hist.csv as h ON(u.c1=h.c1)" 1,userA,2017-7-10 2,userB,2017-7-10 2,userB,2017-7-11
5.3. PostgreSQL
When using PostgreSQL, specify postgres for driver and driver-specific data source name for dsn.
trdsql -driver postgres -dsn "dbname=test" "SELECT count(*) FROM test.csv "5.3.1. Function
The PostgreSQL driver can use the window function.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT row_number() OVER (ORDER BY c2),c1,c2 FROM test.csv" 1,3,Apple 2,2,Melon 3,1,Orange
For example, the generate_series function can be used.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT generate_series(1,3);" 1 2 3
5.3.2. Join table and CSV file is possible
Test database has a colors table.
$ psql test -c "SELECT * FROM colors" id | name ----+-------- 1 | orange 2 | green 3 | red (3 rows)
Join table and CSV file.
$ trdsql -driver postgres -dsn "dbname=test" "SELECT t.c1,t.c2,c.name FROM test.csv AS t LEFT JOIN colors AS c ON (t.c1::int = c.id)" 1,Orange,orange 2,Melon,green 3,Apple,red
To create a table from a file, use "CREATE TABLE ... AS SELECT...".
trdsql -driver postgres -dns "dbname=test" "CREATE TABLE fruits (id, name) AS SELECT c1::int, c2 FROM fruits.csv "$ psql -c "SELECT * FROM fruits;" id | name ----+-------- 1 | Orange 2 | Melon 3 | Apple (3 rows)
5.4. MySQL
When using MySQL, specify mysql for driver and connection information for dsn.
$ trdsql -driver mysql -dsn "user:password@/test" "SELECT GROUP_CONCAT(c2 ORDER BY c2 DESC) FROM testdata/test.csv" "g,d,a"
$ trdsql -driver mysql -dsn "user:password@/test" "SELECT c1, SHA2(c2,224) FROM test.csv" 1,a063876767f00792bac16d0dac57457fc88863709361a1bb33f13dfb 2,2e7906d37e9523efeefb6fd2bc3be6b3f2991678427bedc296f9ddb6 3,d0b8d1d417a45c7c58202f55cbb617865f1ef72c606f9bce54322802
MySQL can join tables and CSV files as well as PostgreSQL.
5.5. Analyze
The -a filename option parses the file and outputs table information and SQL examples.
$ trdsql -a testdata/test.ltsv The table name is testdata/header.csv. The file type is CSV. Data types: +-------------+------+ | column name | type | +-------------+------+ | id | text | | \`name\` | text | +-------------+------+ Data samples: +----+----------+ | id | \`name\` | +----+----------+ | 1 | Orange | +----+----------+ Examples: trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv" trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv WHERE id = '1'" trdsql -db sdb -ih "SELECT id, count(id) FROM testdata/header.csv GROUP BY id" trdsql -db sdb -ih "SELECT id, \`name\` FROM testdata/header.csv ORDER BY id LIMIT 10"
Other options(-id,-ih,-ir,-is,icsv,iltsv,-ijson,-itbln...) are available.
trdsql -ih -a testdata/header.csvSimilarly, with -A filename option, only Examples (SQL) is output.
$ trdsql -ih -A testdata/header.csv trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv" trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv WHERE id = '1'" trdsql -ih "SELECT id, count(id) FROM testdata/header.csv GROUP BY id" trdsql -ih "SELECT id, \`name\` FROM testdata/header.csv ORDER BY id LIMIT 10"
5.6. Configuration
You can specify driver and dsn in the configuration file.
Unix like.
${HOME}/.config/trdsql/config.json
Windows (ex).
C:\Users\{"User"}\AppData\Roaming\trdsql\config.json
Or use the -config file option.
trdsql -config config.json "SELECT * FROM test.csv"sample: config.json
{
"db": "pdb",
"database": {
"sdb": {
"driver": "sqlite3",
"dsn": ""
},
"pdb": {
"driver": "postgres",
"dsn": "user=test dbname=test"
},
"mdb": {
"driver": "mysql",
"dsn": "user:password@/dbname"
}
}
}The default database is an entry of "db".
If you put the setting in you can specify the name with -db.
$ trdsql -debug -db pdb "SELECT * FROM test.csv" 2017/07/18 02:27:47 driver: postgres, dsn: user=test dbname=test 2017/07/18 02:27:47 CREATE TEMPORARY TABLE "test.csv" ( c1 text,c2 text ); 2017/07/18 02:27:47 INSERT INTO "test.csv" (c1,c2) VALUES ($1,$2); 2017/07/18 02:27:47 SELECT * FROM "test.csv" 1,Orange 2,Melon 3,Apple
6. Library
Example of use as a library.
package main import ( "log" "github.com/noborus/trdsql" ) func main() { trd := trdsql.NewTRDSQL( trdsql.NewImporter(trdsql.InDelimiter(":")), trdsql.NewExporter(trdsql.NewWriter()), ) if err := trd.Exec("SELECT c1 FROM /etc/passwd"); err != nil { log.Fatal(err) } }
Please refer to godoc and _example for usage as a library.
7. See also
- psutilsql - A tool for querying system status in SQL.
- mdtsql - A tool for querying markdown tables in SQL.
- xlsxsql - A tool for querying Excel files in SQL.
8. Learn More
- https://noborus.github.io/trdsql/index.html
- https://cn.x-cmd.com/pkg/trdsql#trdsql
- https://x-cmd.com/1min/trdsql#trdsql
9. License
MIT
Please check each license of SQL driver.
