pulley

In Canada, all natural resource exploration reports are publicly available. Government websites store thousands of PDF reports, each stating something like, "this specific piece of land contains X ppm of material Y."

But there's a huge problem in the Canadian mining industry: if you want to mine a certain material in Canada and work out exactly where to mine it, you currently have to read through these reports by hand. Some of them are up to 60 years old. Right now, every mining company is stuck doing this, and it's incredibly inefficient.

I developed a tool that processes all these documents (even the 60-year-old ones). I built an LLM-based agent that reads these documents, some of which are scans of 40-year-old paper reports, performs OCR, and then uses specific heuristics to understand the relationships between the different material samples within the document. The agent parses the document to extract detailed information about material concentrations found in soil samples. After extraction, it fills out a structured JSON schema and uploads this JSON into a MongoDB database, making the information searchable.
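To make the "structured JSON schema" step concrete, here is a minimal sketch of what one extracted record could look like. The field names (report_id, element, concentration_ppm, and so on), the validation rules, and the sample values are all assumptions made for illustration; the actual schema is not described above.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class SampleRecord:
    """One extracted measurement. All field names are hypothetical."""
    report_id: str                 # identifier of the source PDF report
    element: str                   # e.g. "Au" for gold
    concentration_ppm: float       # concentration in parts per million
    latitude: Optional[float]      # None if the report gives no coordinates
    longitude: Optional[float]
    source_page: Optional[int]     # page of the PDF the value came from


def validate(record: SampleRecord) -> List[str]:
    """Return a list of problems; an empty list means the record looks sane."""
    problems = []
    if not record.element:
        problems.append("missing element")
    if record.concentration_ppm < 0:
        problems.append("negative concentration")
    return problems


def to_documents(records: List[SampleRecord]) -> List[dict]:
    """Convert valid records to plain dicts, ready to be stored as JSON."""
    return [asdict(r) for r in records if not validate(r)]


# Illustrative values only, not taken from any real report.
records = [
    SampleRecord("EXAMPLE-001", "Au", 0.8, 48.5, -81.3, 12),
    SampleRecord("EXAMPLE-001", "Cu", -5.0, 48.5, -81.3, 12),  # rejected
]
docs = to_documents(records)
print(json.dumps(docs, indent=2))
# With pymongo, the upload step could then be: collection.insert_many(docs)
```

Validating before upload is one way to keep OCR or extraction mistakes (such as a negative concentration) out of the searchable database.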

The agent achieved pretty good accuracy (~90%), but running it at scale was expensive. There are hundreds of thousands of PDFs for a single province alone, so processing the entire country would cost millions of dollars. To solve this, I built a web interface where a mining company can go to a map and draw a polygon around the area they're interested in, which triggers a scraper agent. The scraper agent collects the relevant documents and passes them to a second agent, which parses them into a structured format and uploads the data into the database.
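One way to decide which reports are "relevant" to a drawn polygon is a point-in-polygon test. The sketch below uses the standard ray-casting method and assumes each report carries a single representative (longitude, latitude) pair under a hypothetical "location" key; how the real scraper matches reports to an area is not stated above.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray going right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_reports(reports: List[Dict], polygon: List[Point]) -> List[Dict]:
    """Keep only reports whose location falls inside the user's polygon."""
    return [r for r in reports if point_in_polygon(r["location"], polygon)]


# Illustrative data: a square area and two made-up report locations.
area = [(-80.0, 45.0), (-79.0, 45.0), (-79.0, 46.0), (-80.0, 46.0)]
reports = [
    {"report_id": "EXAMPLE-A", "location": (-79.5, 45.5)},
    {"report_id": "EXAMPLE-B", "location": (-75.0, 45.5)},
]
selected = select_reports(reports, area)
print([r["report_id"] for r in selected])
```

Processing only the selected reports is what keeps the cost proportional to the area a user cares about rather than to the whole archive.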

The workflow is straightforward: a user selects an area on the map → the agents start working → a few minutes later, the user gets a notification that the area is processed and ready to be searched. Users can then open a web interface and perform detailed queries like "sort areas by gold concentration in ppm."
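A query like "sort areas by gold concentration in ppm" can be sketched as a filter followed by a sort. The example below runs in memory on made-up documents, using hypothetical field names ("area", "element", "concentration_ppm"); the real query layer is not described above.

```python
from typing import Dict, List


def rank_by_element(docs: List[Dict], element: str) -> List[Dict]:
    """Return documents for one element, highest concentration first."""
    matching = [d for d in docs if d["element"] == element]
    return sorted(matching, key=lambda d: d["concentration_ppm"], reverse=True)


# Illustrative values only.
docs = [
    {"area": "area-1", "element": "Au", "concentration_ppm": 0.4},
    {"area": "area-2", "element": "Cu", "concentration_ppm": 120.0},
    {"area": "area-3", "element": "Au", "concentration_ppm": 1.9},
]
ranked = rank_by_element(docs, "Au")
print([d["area"] for d in ranked])
# Against MongoDB with pymongo, a comparable query could be:
#   collection.find({"element": "Au"}).sort("concentration_ppm", -1)
```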

Once users find the material they're interested in, they can directly access the original PDF reports, study them thoroughly, and proceed to mine the materials (of course, after obtaining the necessary permits first!).