Meet Datrics: the No-Code Machine-Learning and Data Analytics Platform
StartupYard Remote Lab launched in June 2020, and today it is our pleasure to present the first startup to have graduated from this unique, fully remote program. Datrics helps FinTech companies build and deploy machine learning solutions without coding. Other industries can also benefit from Datrics’ powerful interactive environment. We sat down with Anton Vaisbrud, CEO and co-founder of Datrics, to learn more about his product and his plans for the future.
Datrics is the first StartupYard company, so far, to have been accepted into the prestigious YCombinator program.
Hi Anton! Who is Datrics for, and what problem are you solving?
Datrics helps FinTech companies build and deploy machine learning solutions without coding. For instance, one of our customers is a consumer lending company that created its new risk model using just our drag-n-drop interface.
Spreadsheet analytics and data prep are as time-consuming as advanced machine learning, and spending 400+ hours yearly on repetitive tasks is not unheard of. So we’ve built a cloud-native solution for Citizen X – Scientists and Developers, with a simple and intuitive web interface and an emphasis on ML components. We have broadened our aim to cover analytics process automation for advanced spreadsheet users and visual automation of ETL (data prep), all without coding.
Our audience is Data Workers (1 billion in the world!), ranging from analysts, engineers, and domain experts who want to have a short path from raw data to insights and decision-making, automate simple analytics and ETL, to machine learning engineers and professional data scientists, who want to focus on data processing and model creation rather than spend time on engineering tasks.
FinTech is the core vertical for us, as here data is at the core of the business, and the fraction of people working with data is larger (usually 50+% of the organization). We help build risk models, run transactional analytics, forecast demand, and better target marketing campaigns, all without coding.
What prompted you to start developing Datrics? What did you see as fundamentally wrong with the way machine learning and analytics work right now?
We realized that it usually takes months of work to turn an idea into a complete solution, going through data snapshot gathering, data cleansing, experimenting, and working with engineering and DevOps teams to turn experiments in a Jupyter notebook into a complete application that works in production.
Because of this, we started working on a no-code machine learning platform for everyone: a tool to cover experiments and MLOps. Then the idea evolved, and we decided not to limit ourselves to machine learning, because knowledge workers need a tool to do analytics and then machine learning on top of it, so we are covering the end-to-end flow now.
There is nothing inherently wrong with machine learning itself. The problem is that a lot of everyday operations take considerably longer than they should. Also, most people who do data science are not proficient in coding or software engineering, so deployments, API creation, and productionalization in general are a hassle. Datrics helps with all this: users only need to focus on applying their domain expertise and data to the end goal, while we handle everything related to the infrastructure, help them experiment and iterate faster, and ensure that moving to production is always just a few clicks away.
How big is the team and why are they a good match for this project?
Over the last year, thanks to our fantastic users, we’ve grown to a team of twelve (and growing further!) and moved the product from an alpha version to production, and we are now onboarding new B2B clients weekly! The team members are perfectly in tune with each other, and we know exactly what we are building together. Our founding team has experience leading a 30+ person data science consultancy and founding outsourcing companies. The data scientists on our team have built dozens of production ML solutions, and we know the struggle it can be. What we are building is a tool that we constantly use ourselves to create pipelines for our users, as well as a tool that is very easy to use for a person with a very limited ML skillset.
Your team comes from the consulting world. What have been your biggest personal or professional challenges in switching from consulting to building a SaaS product?
There haven’t been too many challenges, because we’ve been working on data science solutions for many years and we know the main challenges in developing an analytics solution. Now we just manage the process of solution architecture and development for our own needs. The same approach applies here: make your end users happy, do customer interviews and iterate fast!
We know what parts are the most time-consuming, what parts should be automated, and what things are most important and need extra attention, as we’re the users of our product as well! Even though we’re building a SaaS product, we’re still focusing on solving real-life problems and providing business value to our customers, based on the domain knowledge and use cases we know, such as Credit Risk Modeling, Churn Forecasting, etc.
We are democratizing the beautiful field of data science and helping our users reap the benefits of analytics and machine learning sooner and more easily.
Can you tell us a bit more about how your technology really works? How unique is your solution?
Long story short: you can think of the Datrics platform as an additional graphical layer over Python that covers all the functionality one might need for analytics or ML-related tasks. There is a heavy emphasis on automation and ease of use, and we are empowering our users to work faster and more efficiently.
The tool itself is built mainly in Python, and we use Dask to enable distributed computing. This allows us to work with big data easily and efficiently, with the added bonus of access to a huge community and a set of available solutions. Additionally, if our users want to, they can easily develop their own “custom bricks” using Python and still benefit from all the infrastructure and automation aspects of the tool.
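To make the idea of a "custom brick" concrete, here is a minimal sketch in plain Python of how a pipeline of small, reusable steps might be composed. It is an illustration only: the `Brick` class, `run_pipeline` function, and the two example transforms are hypothetical names invented for this sketch and are not Datrics' actual API, and the sketch does not use Dask.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Table = List[Row]


@dataclass
class Brick:
    """Hypothetical pipeline step: a name plus a table-to-table function."""
    name: str
    transform: Callable[[Table], Table]


def drop_missing(rows: Table) -> Table:
    # Keep only rows in which every value is present.
    return [r for r in rows if all(v is not None for v in r.values())]


def add_debt_ratio(rows: Table) -> Table:
    # Derive a new column from two existing ones.
    return [{**r, "debt_ratio": r["debt"] / r["income"]} for r in rows]


def run_pipeline(bricks: List[Brick], rows: Table) -> Table:
    # Apply each brick in order, feeding its output to the next one.
    for brick in bricks:
        rows = brick.transform(rows)
    return rows


pipeline = [
    Brick("drop_missing", drop_missing),
    Brick("debt_ratio", add_debt_ratio),
]
data = [
    {"income": 4000, "debt": 1000},
    {"income": None, "debt": 500},
]
result = run_pipeline(pipeline, data)
print(result)
```

In a visual tool, each brick of this kind would presumably appear as a box on the canvas, with the connections between boxes playing the role of the list order used here.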
There are a few other interesting things about our product that differentiate us from other solutions. One of them is that we’re cloud-native, so literally no local installation is required to use our product. You can just use your browser. As an added bonus, since the instance of our tool can be in the same cluster as the data, users can benefit from improved data read/write speeds, and there is no need to worry about security since data never leaves the cloud instance. If needed, we can even install the tool on the client’s physical servers, as we’ve done for one of our customers operating in the banking industry.
Also, we are focusing heavily on collaborative use of our platform. Different people and even teams can share parts of data, pipelines, or visualizations with each other. We keep track of versions of the pipelines created, models trained, and APIs deployed. Any pipeline can be restored with total reproducibility of the results, which is helpful for regulatory purposes as well.
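One common way to get this kind of version tracking is to store every pipeline definition under an identifier derived from its content. The sketch below illustrates that general technique with the Python standard library; the `PipelineStore` class and its methods are hypothetical and are not a description of how Datrics implements versioning.

```python
import hashlib
import json
from typing import Dict, List


class PipelineStore:
    """Hypothetical in-memory store that versions pipeline definitions."""

    def __init__(self) -> None:
        self._versions: Dict[str, str] = {}
        self._history: List[str] = []

    def save(self, definition: dict) -> str:
        # Serialize with sorted keys so equal definitions give equal IDs.
        canonical = json.dumps(definition, sort_keys=True)
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        version_id = digest[:12]
        if version_id not in self._versions:
            self._versions[version_id] = canonical
            self._history.append(version_id)
        return version_id

    def restore(self, version_id: str) -> dict:
        # Return a fresh copy so callers cannot mutate stored versions.
        return json.loads(self._versions[version_id])

    def history(self) -> List[str]:
        return list(self._history)


store = PipelineStore()
v1 = store.save({"steps": ["load", "clean"], "model": {"seed": 42}})
v2 = store.save({"steps": ["load", "clean", "scale"], "model": {"seed": 42}})
print(store.restore(v1)["steps"])
```

Because the definition includes the random seed, restoring an old version brings back everything needed to rerun the pipeline with the same settings.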
The last main distinguishing feature is the availability of ready templates, so users do not need to start from scratch when building their own solutions.
Our solution is not the only one on the market, but the set of features we’re offering is quite unique, covering end-to-end analytical flows.
So, cloud-native, distributed compute, templates for specific vertical use cases; BI, data prep and ML/MLOps in one toolbox!
It sounds like any industry could use a solution like yours. How much of a problem is that for a small start-up? Are you able to focus on one segment or are you serving anyone who comes through your door?
Exactly, the tool itself is industry-agnostic; it’s just that we’re focusing on the industries we have the most personal experience with. I wouldn’t call that a problem, rather a very interesting challenge. Building a tool that can solve many problems efficiently is challenging but extremely rewarding at the same time. Being a small start-up has both advantages and disadvantages. The main advantage is that we can iterate really fast, and it takes only days when we need to create new functionality for some of our users. Another is that we’re working with early customers very closely, ensuring that we solve their pain points. The main disadvantage is that the product is very complicated to build, so we need a very mature team of diverse talents (and we have an amazing one!); building a tool like this with a tiny team is impossible.
Regarding the focus on segments, we have many years of consulting experience, so we can help pretty much anyone working with structured data. We are using our platform to solve use cases in quite a few different industries, e.g., logistics, healthcare, retail/e-commerce and manufacturing. That being said, we are mainly focusing on the financial sector, since we have the most experience there and it is where we’ve solved the most cases using our platform so far: Credit Risk Analytics and Modeling, Customer Segmentation, campaign personalization, transactional analytics and anti-fraud.
You have been accepted into YCombinator. What are your expectations and goals?
YCombinator is amazing and has satisfied most of our expectations already. We’ve grown significantly (9 times in revenue!) since applying to YC and are still growing confidently. YC has helped us narrow our focus and concentrate on the things that are most important, especially in the earlier stages of a company: build something people want, talk to our customers and iterate faster. The community is very friendly and supportive, and a huge number of successful alumni help with advice, especially in reaching potential customers.
Is Datrics available now? What do people need to get started with your service?
Absolutely, you can go to our website and sign up for free. We have a wiki and YouTube channel where you can find information about how to create your first pipeline and how to use the tool. The free version is somewhat limited in its capabilities, but you can easily get an understanding of how the tool works and how it can be applied to your specific needs. For commercial usage we do private cloud installations; please reach out to us at firstname.lastname@example.org and we’re happy to get you onboarded!
Has there been a major surprise for you since joining the StartupYard Remote Lab program? Did you learn something you weren’t expecting to?
The major thing is that the program is very flexible and fine-tuned to your needs. Every time we had a question, we got an answer very quickly, and it was not general advice but tailored to our situation and needs, which is what we are grateful for.
We were able to talk to many people and look at the company from a business standpoint, not only a technical one, which was a great step toward building a state-of-the-art tool for analytics and machine learning!