Briefer
All posts

Self-serve dashboards

6 minsJun 3, 2024authorLucas da Costa

Sales pitches are the only place where “self-serve dashboards" work. In the real world, it's a different story.

That story usually starts with an engineer or data scientist who's frustrated because they spend too much time writing queries and preparing dashboards for business people. They think that if they make BI easy enough, everyone will be able to “self-serve”, but that rarely ever happens.

What actually happens is that engineers and data scientists end up being the only people using the "self-serve" BI tool, which, ironically, makes it not self-serve at all.


Why self-serve BI doesn't work

SQL is the only self-serve BI tool. Still, most "self-serve" BI vendors don't want to admit it, so they find ways to reinvent the wheel and disguise SQL as something else. Maybe that's what customers want to hear, or maybe they genuinely believe it, I'm not sure.

Either way, the problem with most approaches to "self-serve" BI is that they consider SQL to be the only barrier to business stakeholders querying data. That's not true.

Let's say you could wave a magic wand and make writing SQL queries as easy as writing an email. Do you think business stakeholders would suddenly start querying data? The answer is no.

Even if these people knew how to write SQL queries, they wouldn't understand the semantics of the data they're querying. They wouldn't know what the data means, where it comes from, or how it's calculated. They wouldn't know how to interpret the results, or how to validate them. In other words, writing the SQL query is actually the easy part.

Now, let's look at two attempts to make BI "self-serve" and see whether the hypothesis above holds.


Attempt 1: The conventional "dropdowns and checkboxes" approach

First, let's take a look at an example of a conventional "self-serve" BI interface which uses dropdowns and checkboxes as a way to make queries more accessible.

The first thing to notice is that this interface is just an attempt at what I call "SQL-by-mouse". I might be just an old grumpy person, but I don't see how this is any better than writing SQL. In fact, it's worse because it's slower, less reliable, more limited, and not generalizable to other tools.

In any case, I'll concede one thing: some non-technical people are just scared of any monospaced writing with syntax highlight, so this interface might be a good first step.

Anyway, let's ignore all these problems and just assume that this interface is the best thing since htmx. Let's assume that it's 100x easier to use than SQL.

Even if that were the case, your CFO, for example, wouldn't just go ahead and use these dropdowns and checkboxes to query data. The reason they wouldn't use it is that they don't have the context to understand the data they're querying, and they'll probably be insecure about the results, which they won't know how to validate because they don't understand how the data got there in the first place.

Now, I've got a question for you: if your CFO doesn't use this interface, who will? That's right, you will. And guess what's the first thing you'll do when you get to these dropdowns? Correct, you'll immediately look for somewhere to type in SQL. Oh, the irony.


Attempt 2: The text-to-SQL approach

Remember that magic wand that makes writing SQL queries as easy as writing an email? That's what text-to-SQL tools are. Nonetheless, they're not enough, just as I explained earlier.

Again, the problem is not the technology itself. In fact, LLMs are almost too effective at translating natural language into SQL. They will find a way to generate a query for any question you ask, even if that question doesn't make sense, which is exactly the problem.

On the other hand, a technical person would notice that the question doesn't make sense, and they would ask for more context. They would ask for details about the business person's hypothesis and the problem at hand. Then, they would explain what type of data is available, and work with the business person to formulate a precise and useful question.

Again, SQL is not the problem.

In any case, I do think LLMs could be the actual solution to self-serve BI, but not in its current form. For them to work, they'd need to be fed with more context, and they need to get better at expressing uncertainty and asking for more information.


What actually works

If we assume that the problem with self-serve BI is not SQL, but the context and semantics of the data, then it follows that the solution is to teach people about the data they're querying, regardless of interface.

The problem with this type of training is that it takes time, and most business stakeholders need answers this week, not next quarter.

Even in the best-case scenario, where companies train business stakeholders to understand the data they're querying, these people will need time to keep up with the changes in the database's schema, data models, and ETL processes.

Additionally, documenting all this knowledge generates significant overhead for the technical team, and it quickly gets out-of-date.

So what's the true solution to self-serve BI? The answer is simple: not to make BI self-serve for non-technical people. Instead, the solution is to make technical people support business stakeholders and help them do it more efficiently, using better tools.

The definition of a better tool may vary, but I have a few suggestions.

The first suggestion is to give LLMs to technical people, not business stakeholders. Even though text-to-SQL is not quite there in terms of understanding context and semantics, technical people already have context and semantics, so why not let them use it? That way, a single technical person can serve more business stakeholders, and do it faster.

The second suggestion is to give technical people more flexible tools. Instead of giving them dropdowns, or a SQL-only interface, let them play around with data using Python, R, or any other tool they're comfortable with. This way, they don't need to be passing data back and forth between different tools, and they're not limited by the capabilities of a BI tool.

The third and final suggestion is to make it easier for technical people to share their work. Notebooks and internal data applications are notoriously bad at that because they require others to deal with containers, dependencies, and infrastructure. It's a lot of overhead for a business stakeholder who just wants to see a chart. Instead, we need tools that have all the building blocks of a data application, but that are as easy to share as a Google Doc.

I'm obviously biased because I'm building a data tool for technical people, and it does exactly these three things. But what other choice do I have? I'd rather build tools that work than tools that don't. And, most importantly, I'd rather not pretend that the problem is something it's not.