April 29, 2026
One of the difficulties of working with data about the labor movement is that it is often hard to know which labor organization you are dealing with.
labor-union-parser is a new Python package that looks up a local union’s Office of Labor-Management Standards (OLMS) filing number from short texts like “United Automobile, Aerospace and Agricultural Implement Workers of America, UAW Local 1803” and “LOCAL 6, NEW YORK HOTEL & MOTEL TRADES COUNCIL, UNITE HERE.”
Most labor organizations representing private sector workers are required to file an annual financial report with the Department of Labor’s Office of Labor-Management Standards. The Office maintains a more or less consistent filing number for each union across its filings, so the Office’s data is the closest thing we have to a comprehensive gazetteer of labor unions that represent private-sector workers.
This tool uses a probabilistic model to do the lookup, and it’s pretty accurate. It predicts three things: whether a text refers to a union, what the union affiliation of the local is, and what the filing number of the union is.
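To make those three predictions concrete, here is a minimal sketch of how one result could be held as a record. The class, field, and function names are hypothetical and are not the package's actual API; check the repo for the real interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnionPrediction:
    """Hypothetical record for one parsed text (not the package's real type)."""
    is_union: bool                # does the text refer to a union at all?
    affiliation: Optional[str]    # e.g. "UNITE HERE"; None if not a union
    filing_number: Optional[int]  # OLMS filing number; None if no match


def usable_filing_number(pred: UnionPrediction) -> Optional[int]:
    """Return the filing number only when the text was judged to be a union."""
    return pred.filing_number if pred.is_union else None


# Illustrative values only; no real filing numbers are used here.
not_union = UnionPrediction(is_union=False, affiliation=None, filing_number=None)
unmatched = UnionPrediction(is_union=True, affiliation="UNITE HERE", filing_number=None)

print(usable_filing_number(not_union))
print(unmatched.affiliation)
```

The point of the sketch is that the three answers depend on each other: a filing number only makes sense once a text has been judged to refer to a union.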
| Metric | Score |
|---|---|
| End-to-end accuracy | 97.8% |
| Is-union accuracy | 99.2% (4402/4437) |
| Filing number accuracy | 98.3% (3804/3868) |
| Union name accuracy | 97.8% (4665/4771) |
| Wrong match (union, wrong filing number) | 64 |
| False negatives (union missed) | 8 |
| False positives (non-union matched) | 27 |
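As a quick sanity check, the percentages and error counts in the table can be recomputed from the raw counts. This sketch uses only numbers from the table; the variable and function names are illustrative.

```python
# (correct, total) pairs copied from the table above.
counts = {
    "is union": (4402, 4437),
    "filing number": (3804, 3868),
    "union name": (4665, 4771),
}


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


for name, (correct, total) in counts.items():
    errors = total - correct
    print(f"{name}: {accuracy_pct(correct, total)}% ({errors} errors)")

# Error counts copied from the last three rows of the table.
wrong_match, false_negatives, false_positives = 64, 8, 27

# Filing number errors are exactly the wrong matches.
assert counts["filing number"][1] - counts["filing number"][0] == wrong_match

# Is-union errors split into missed unions and non-unions matched.
assert counts["is union"][1] - counts["is union"][0] == false_negatives + false_positives
```

The two assertions show how the rows fit together: the 64 wrong matches account for all of the filing number errors, and the 8 false negatives plus 27 false positives account for all 35 is-union errors.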
In our test set, for the records we were able to identify as referring to a particular filing number, we get 98.3% accuracy. As we’ll see, the performance is often better than that.
I have this wired up into labordata.bunkum.us, where it runs every night against any column in any table that refers to unions. Here’s a query of the number of distinct election petition cases that unions participated in during the last quarter of 2025.
Across the 288 variations of the texts for participants, there is only one error, and an instructive one. The California State University Employees Union is mislabeled as belonging to “Field Representatives Union,” the staff union of the California Federation of Teachers.
Until now, the California State University Employees Union has been a public-sector union, and it has never filed with the OLMS. This tool will go badly wrong if the texts include public-sector unions. If you have ideas for a good data source for public-sector locals, please let me know.
In the repo, I have put together a training corpus of over 150,000 examples.
The accuracy of this model is the best I’ve been able to achieve, but much smarter folks than me read these notes. Please use the data to make a better model.
Subscribe to get Notes on Labor Data as an email newsletter.