Alpine Intelligence

by

Nick Jaouen

nickjaouen

Service specialties

Web App Solutions

Recent Projects

Screen recording of the JHDash! app showing transit, weather, snow depth, and daily message features

JHDash!

Until recently, Jackson Hole's bus system required riders to consult spreadsheets on its website to find bus times. In the past couple of years, Google Maps added the bus routes and an app called Transit started supporting Jackson Hole. I grew frustrated with the user interfaces of both, so I decided to build my own no-nonsense app for finding bus times. The idea has grown over time, and I now have a collaborator who has assisted with some important features. It started with just the bus and some weather. Now it shows bus times, weather information, snow depth at Jackson Hole, and a message of the day.

Once a day, a cron job sends the day's weather, a list of the day's events, and a couple of recent messages of the day to an LLM API endpoint to generate six message variants. The messages vary in length and tone. All are light-hearted and try to keep the user informed about what is going on in the valley on any particular day. The event data comes from "JHDash - Events!", a separate Python app.
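The daily message step can be sketched roughly as follows. This is a minimal illustration written in Python (the app itself is TypeScript), not the actual code: the `DailyContext` fields, the prompt wording, the instruction to avoid repeating recent messages, and the JSON-array response format are all assumptions, and the call to the LLM endpoint is left out.

```python
import json
from dataclasses import dataclass

VARIANT_COUNT = 6  # the write-up says six variants are generated each day


@dataclass
class DailyContext:
    """Inputs gathered by the daily job (field names are hypothetical)."""
    weather_summary: str
    events: list
    recent_messages: list


def build_prompt(ctx: DailyContext) -> str:
    """Assemble a single prompt asking for VARIANT_COUNT messages."""
    payload = {
        "weather": ctx.weather_summary,
        "events": ctx.events,
        "recent_messages": ctx.recent_messages,
    }
    return (
        f"Write {VARIANT_COUNT} light-hearted messages of the day for Jackson Hole. "
        "Vary the length and tone, and do not repeat the recent messages.\n"
        "Respond with a JSON array of strings.\n"
        f"Context: {json.dumps(payload)}"
    )


def parse_variants(raw_response: str) -> list:
    """Check that the model returned exactly VARIANT_COUNT non-empty strings."""
    variants = json.loads(raw_response)
    if not isinstance(variants, list):
        raise ValueError("expected a JSON array")
    cleaned = [v.strip() for v in variants if isinstance(v, str) and v.strip()]
    if len(cleaned) != VARIANT_COUNT:
        raise ValueError(f"expected {VARIANT_COUNT} messages, got {len(cleaned)}")
    return cleaned
```

Validating the response before saving it means a malformed reply fails loudly in the daily job instead of showing up as a blank message in the app.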

JHDash! is built on the Vercel Next.js stack, with TypeScript and Node for the front and back ends, and Prisma for the Postgres database. All background photos were taken by me. Components are a mixture of custom-built components and shadcn/ui components. It is deployed on Vercel and is available at JHDash.com.

JHDash - Events!

JHDash - Events! is a Python-based project built with LangChain and LangGraph. A LangGraph StateGraph creates an agentic flow that starts with the URL of a website listing events. A large language model (LLM) uses Playwright as a tool to load the page and scrape it. The scraped content is then fed into an LLM, which extracts event names, locations, dates, descriptions, and event page URLs.

The list of events is then broken into individual events, and the main pages for those events are scraped using Playwright. The data is then passed to an LLM to enrich the information already gathered about each event. Some events can be part of a series, such as a summer concert series. In that case, the LLM identifies the event as a possible series, and series candidate records are created for the individual events in that series. Those events are then researched in the same way as any other event. The final list is loaded into a Postgres database for JHDash! to access.
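The flow described above can be sketched in plain Python. This is only an outline under stated assumptions: the real project wires these steps together as nodes in a LangGraph StateGraph, and the `scrape`, `extract_events`, `enrich`, and `expand_series` callables here are hypothetical stand-ins for the Playwright and LLM calls. Whether the parent series record is itself stored is not stated, so this sketch keeps only the individual events.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    name: str
    url: str
    details: dict = field(default_factory=dict)
    is_series_candidate: bool = False
    parent_series: Optional[str] = None  # set for events created from a series


def run_pipeline(
    listing_url: str,
    scrape: Callable[[str], str],                  # URL -> page text
    extract_events: Callable[[str], List[Event]],  # listing text -> events
    enrich: Callable[[Event, str], Event],         # event + page text -> event
    expand_series: Callable[[Event], List[Event]], # series -> individual events
) -> List[Event]:
    """Scrape a listing page, then research each event it mentions."""
    queue = deque(extract_events(scrape(listing_url)))
    finished: List[Event] = []
    while queue:
        event = queue.popleft()
        enriched = enrich(event, scrape(event.url))
        if enriched.is_series_candidate and enriched.parent_series is None:
            # Individual events in the series re-enter the queue and are
            # researched like any other event. Events that already came
            # from a series are never expanded again, so the loop ends.
            queue.extend(expand_series(enriched))
        else:
            finished.append(enriched)
    return finished  # ready to be loaded into the database
```

Passing the scraping and LLM steps in as functions keeps the control flow testable with simple fakes, without a browser or an API key.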

Throughout the process, agents are used to identify duplicate events and to standardize the names of events, organizations hosting events, and event venues. Retrieval-augmented generation (RAG) is utilized to search for existing canonical names in a Postgres database via vector search. Candidates are then sent to an LLM to decide whether they are matches. RAG is typically used as a method for augmenting the knowledge available to an LLM, but here it is used as a method for ensuring data quality.
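The name-matching idea, retrieve likely candidates by vector similarity and then let a model make the final call, can be sketched as below. This is an in-memory stand-in, not the project's code: the real search runs against Postgres, the `judge` callable is a hypothetical placeholder for the LLM decision, and the `k` and `min_score` values are arbitrary assumptions.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_candidates(
    query_vec: Sequence[float],
    canonical: Dict[str, Sequence[float]],  # canonical name -> embedding
    k: int = 3,
    min_score: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return up to k canonical names most similar to the query embedding."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in canonical.items()]
    scored = [pair for pair in scored if pair[1] >= min_score]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def resolve_name(
    raw_name: str,
    query_vec: Sequence[float],
    canonical: Dict[str, Sequence[float]],
    judge: Callable[[str, str], bool],  # stand-in for the LLM match decision
) -> Optional[str]:
    """Return the first candidate the judge accepts, or None for a new name."""
    for name, _score in top_candidates(query_vec, canonical):
        if judge(raw_name, name):
            return name
    return None
```

The vector search only narrows the field. The judge makes the final decision, so a near miss in embedding space is not merged automatically, and a `None` result signals that the name should be treated as new.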

Tavily web search is also used throughout the process to perform searches for canonical names of organizations and venues using their official websites. Tavily results are passed to an LLM for inspection.

Hard-coded rules are kept to a minimum during the process. It’s easy to fall down the rabbit hole of building complex hard-coded rules and lists of exceptions to handle every scenario and outlier. This process eschews those older methods of complex rules and functions and instead focuses on utilizing large language models to make decisions about how data should be handled. The results are surprisingly good and require far less code than a traditional deterministic approach.

The project was initially built piece by piece in several Jupyter notebooks. It was then transferred to Python scripts that can be run automatically from the command line. The process runs locally on a Mac mini, and results are stored in the same Prisma-managed Postgres database that JHDash! uses. One table holds canonical names and aliases, along with 1536-dimensional vector embeddings that can be searched. The other table holds events. The event table contains a composite description column with a corresponding vector embedding that can be used for vector searches. Because vector embeddings represent the meaning of their content rather than its exact words, they are well suited to finding similar names that may contain typos or be written in different ways.
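As a rough picture of the two tables, the sketch below models one row of each as a Python dataclass. Only the 1536-dimension embedding size and the existence of a composite description come from the description above. The field names, and the choice to build the composite description from the name, venue, date, and description, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

EMBEDDING_DIM = 1536  # embedding size stated in the write-up


def check_embedding(vec: List[float]) -> List[float]:
    """Reject vectors that do not match the expected dimension."""
    if len(vec) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vec)}")
    return vec


@dataclass
class CanonicalName:
    """One row of the canonical names table (hypothetical field names)."""
    name: str
    aliases: List[str]
    embedding: List[float]

    def __post_init__(self) -> None:
        check_embedding(self.embedding)


@dataclass
class EventRow:
    """One row of the events table (hypothetical field names)."""
    name: str
    venue: str
    date: str
    description: str
    embedding: Optional[List[float]] = None  # embedding of composite_description

    @property
    def composite_description(self) -> str:
        # Assumed recipe: join the non-empty fields into one string to embed.
        parts = [self.name, self.venue, self.date, self.description]
        return " | ".join(p.strip() for p in parts if p and p.strip())
```

For example, a made-up row `EventRow("Summer Concert", "Town Square", "2025-07-04", "Free live music")` would produce the composite description `Summer Concert | Town Square | 2025-07-04 | Free live music`, which is the text that would be embedded for vector search.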

View the project on GitHub:
JHDash - Events!

Contact

Need help integrating AI into your organization?

Do you need assistance with data engineering or app development?

Contact me here: nick@alpineintelligence-llc.com