SQL Pivot: Converting Rows to Columns

77 points by dmatrix 7 years ago · 23 comments

Reader

danso 7 years ago

I vaguely understand (on a theoretical level) why the syntax for pivoting in databases is necessarily convoluted, but it does underscore to me how spreadsheets have a great advantage for everyday data-exploring, at least when it comes to ease of pivoting.

(by "everyday", I mean working with datasets small enough to fit within a spreadsheet's limits, and aren't yet important enough to need the reliability/regularity that comes with importing it to a database)

nerdponx 7 years ago

The ability to use Excel as a "front-end" for other data sources is underrated. For all its warts, Excel is a great data manipulation and exploration tool.
- BariumBlue 7 years ago
  
  Due to system limitations, I made a stats tracker for my team in AccessDB. Due to the awfulness of access, I made an excel&vba front-end that would push data to and pull data from access.
  It's been working surprisingly well so far, with no reported issues
  - bhandziuk 7 years ago
    
    I'm curious what you found to be the brickwall for you when using Access?
    
    BariumBlue 7 years ago
    
    The existing "solution" was a gigantic excel where people would put it times and names.
    In Access, when you enter times, you have to use a specific format, (the datetime group specified by the Windows default) rather than just "23:40" or w/e. I've seen a hack where the hour and minutes are separate boxes, but it wasn't pretty. Just this issue was enough to make Access unappealing.
    Access's interface is a bit unintuitive, and given that most people I worked with weren't particularly technically capable, I preferred to present them with the familiar Excel interface (and a few extra buttons), rather than teach them an application they've never used before.
    Plus, the existing workflow already used Excel, so I hijacked their workflow and their routine wouldn't be modified much at all (introducing a completely new system would cause a lot of friction and pushback, but one or two extra buttons was fine).
    
    bhandziuk 7 years ago
    
    Huh. Next time you can set the date format to "mm/dd/yyyy hh:nn" and you can enter 24 hour times with a date.
    But I understand wanting to not disrupt an existing workflow too much.

karkisuni 7 years ago

I wish there was something like this in postgres. tablefunc is a lot more complicated than this syntax.

netcraft 7 years ago

There are a couple ways to pivot in postgres, here is one that uses json functionality https://gist.github.com/ryanguill/101a19fb6ae6dfb26a01396c53...
forkandwait 7 years ago

I would like something with simple functionality, where I can just specify the table name, the category column, and the data column and get a new table transposed ibto columns. I would be happy to do the aggregation outside the pivot process. PROC TRANSPOSE in SAS would be a good model.
GorgeRonde 7 years ago

Dynamic row to column pivotation/transpose in Postgres made simple.
https://github.com/hnsl/colpivot

nevi-me 7 years ago

I had a quick look at some online material on implementations of pivot, as well as what I consider my go-to reference [0] for Apache projects' SQL support (Apache Calcite).

Looks like each DB has a proprietary way of pivoting, and I couldn't find anything on pivot in Calcite. Does anyone know if the SQL impl of pivot was created specifically for Spark, or whether it'd be upstreamed to Calcite? The benefit would be that other projects that rely on Calcite for SQL support would also have pivot capabilities. I'm thinking of Apache Beam.

- [0] https://calcite.apache.org/docs/reference.html

reilly3000 7 years ago

This article enlightened me on the underlying logic, as well as some different implementations. https://modern-sql.com/use-case/pivot
- nevi-me 7 years ago
  
  I was lazy to add its link, that's where I learnt about the proprietary implementations of various databases.
  I wonder if we'll ever get an ANSI-2019 SQL version that tries to merge what various DB vendors have branched off and done. Maybe it already exists? https://www.whoishostingthis.com/resources/ansi-sql-standard...
rxin 7 years ago

Apache Spark doesn't use Calcite.

IshKebab 7 years ago

This is very nice. I've currently had to resort to MariaDB's dynamic columns for my purposes (much nicer than a traditional entity-attribute-value solution) but this looks even nicer. Is this a Spark-only thing?

51Cards 7 years ago

mySQL needs this as well. We do a lot of pivot style reports for clients and while we have it down to a bit of a science now the resulting SQL is a debugging mess and not terribly efficient. (though it still runs fairly quick)

m0dest 7 years ago

I'd love to see this in Presto. It would make my life 1000x easier.

gildas 7 years ago

Off topic: the GPDR banner on this site is a good example of dark pattern. Scrolling into a webpage does not mean I consent collecting personal info about me.

gdhbcc 7 years ago

It's a good example of GDPR non-compliance too.
dspillett 7 years ago

I didn't even get a banner, so presumably I'm assumed to have consented without having taken any action.
ediardo 7 years ago

I can't see the banner...
gildas 7 years ago

Downvoting won't make it more compliant...
Edit: I thought we could discuss about dark patterns on HN. I was obviously wrong. Sorry for that...
- dspillett 7 years ago
  
  The downvotes on the original post are probably because the comment is being seen as off topic. Sometimes comments like that are accepted, sometimes not, I can't say I've noticed much of a patter there other than they are far less welcome on HN than, for example, Reddit.
  I usually play safe and reserve side-topic comments to a post-script after something more immediately relevant (and if I don't have something to say on the thread topic, keep them to myself).
  The downvotes on this one are going to be because of complaining about downvotes!

Settings

SQL Pivot: Converting Rows to Columns

Keyboard Shortcuts