Sai Sreenivas Kodur - Fork to Farm
Welcome to "AigoraCast", conversations with industry experts on how new technologies are transforming sensory and consumer science!
Sai Sreenivas Kodur is the Co-Founder & CTO of Spoonshot, a technology startup supporting food innovation intelligence. Sai is specialised in compilers and machine learning, with a working experience in internet technology companies solving the problems of search and recommendations. Sai worked at Zomato, India's largest food ordering and restaurant discovery platform, as a Software Engineer in search, where he re-built its search relevance and ranking models from scratch to improve restaurant results. Later he worked at Myntra, India's biggest online fashion store, as Senior Software Engineer, leading search and style services and working on various technological functions. Currently, he is spearheading engineering and data science at Spoonshot to build an innovation intelligence and analytics platform for the Food & Beverage industry.
Transcript (Semi-automated, forgive typos!)
John: Sai, welcome to the show.
Sai: Thanks, John. Thanks for having me on the show.
John: Oh, it's great. Yeah, well, you're definitely a kindred spirit. You know, even reading your bio, I noticed it's easy for me to say the words in it. When I start talking about analytics and so on, no matter how tired I am, I will get those words out correctly. Other words, you know, may be a challenge. Anyway, I think it would be good to get started here with you telling our listeners about Spoonshot. You know, what Spoonshot is as a company and then how it is that you came to found Spoonshot. Kind of take us through your journey.
Sai: Okay, great. Yeah. Spoonshot as it is today is a food intelligence and innovation tool. So we are predominantly working with FMCG companies to help them spot opportunities and understand trends from the data sets that we build at Spoonshot. So coming to the story of Spoonshot, it was a funny one. This is back in 2014 or 2015: I happened to see a paper from MIT that talks about using quite basic computer science techniques, graph theory and techniques of that kind, in the world of food, especially representing flavor molecules and ingredients as, you know, a bipartite graph. That caught my attention, and then I started to look at what is the most significant technological innovation or technological application that has happened in food from a computer science point of view. And to my surprise, I couldn't find much in that space. So that caught my interest. At least I thought there could be some opportunity for applying deep technology to the world of food, at least when you compare the food and beverage vertical to other verticals like banking or fashion. So I felt there is a large gap in terms of how much technological innovation can happen in this sector. But after that, I moved on. Of course, in the back of my head, I always had this idea of how to use deep technology in the world of food. So, yeah, I think it was probably the end of 2015 or the start of 2016. That's when I was thinking of doing something in the cooking space, specifically because I am a big foodie myself. I like cooking. And then I happened to meet my Co-founder and CEO, Kishan, and then this journey started. We started under a different name, dishq.
And so back then it was about how you use technology to help people find dishes that they would like from restaurants, again using technology to, you know, bring together the culinary part of the world and, of course, some computer science and AI principles. So I think that's how the journey started. And then later, we quickly turned it into a B2B business. I think from there, it's been a natural evolution to where we are as a company right now. The broad idea still remains the same: being a core data and technology company, helping food and beverage companies become more efficient and agile in their innovation. Going further, it's probably more about expanding to the entire knowledge workforce in the food and beverage sector.
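As a rough illustration of the bipartite-graph idea Sai mentions, here is a minimal Python sketch. It assumes a toy mapping from ingredients to flavor compounds; the ingredient names and compound IDs (`c1`, `c2`, ...) are made-up placeholders, not real flavor chemistry data, and this is not the method of the MIT paper or of Spoonshot. It projects the ingredient-compound graph onto ingredients by counting the compounds each pair shares.

```python
from itertools import combinations

# Hypothetical toy data: one side of the bipartite graph is ingredients,
# the other side is flavor compounds. The compound IDs are placeholders.
INGREDIENT_COMPOUNDS = {
    "tomato": {"c1", "c2", "c3"},
    "basil": {"c2", "c3", "c4"},
    "strawberry": {"c1", "c5"},
}


def shared_compounds(ingredient_compounds):
    """Project the bipartite graph onto ingredients.

    Returns a dict mapping each ingredient pair (in alphabetical order)
    to the number of flavor compounds the two ingredients share.
    Pairs with nothing in common are left out.
    """
    pairs = {}
    for a, b in combinations(sorted(ingredient_compounds), 2):
        common = ingredient_compounds[a] & ingredient_compounds[b]
        if common:
            pairs[(a, b)] = len(common)
    return pairs


if __name__ == "__main__":
    for pair, count in shared_compounds(INGREDIENT_COMPOUNDS).items():
        print(pair, count)
```

With real compound data, the pair counts could serve as edge weights in an ingredient network, which is one simple way to look for candidate pairings.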
John: Yeah, it's fascinating and obviously very close to my heart, a lot of what you're doing. So we're going to review the pieces of what Spoonshot offers. One of the things that really appeals to me about what you all do is that I think you have a very creative list of data sources. I personally am not that enthusiastic about simply looking at Instagram, you know, to try to get insights. Just looking at basic social media, Twitter, Instagram, Facebook, is really, in my opinion, not very inspired research. I mean, there are just so many problems with it. There's a big bias in who's on social media in the first place. That's already a bias. What are people presenting? Is that their authentic self? Is it just their best self? There are a lot of issues with social media data, and there's the fact that it's easy for everybody to get it. So everybody's getting the same insights, right? I mean, if there's some trend on Facebook, everybody can see it. And so it's not really deep insight. Whereas I think you are doing things that are a lot more creative. So if you can talk about, first off, some of the data sources that you look at that might be kind of new for our listeners, things they haven't thought about, some ways that you are looking for data that are maybe less traditional than what typical sensory and consumer scientists are thinking about.
Sai: Sure. I think that's a good question. As you rightly said, there is definitely merit in using social media, but just relying on social media sources has a lot of downsides as well. First and foremost, food can be looked at from various points of view. If you start looking at it from a culinary perspective, for example, food is a very local innovation. So you need to understand the nuances of the culinary side, especially in terms of the cultural differences, and understand that from authentic data sources, which of course are available locally. Part of it is on Instagram, because everyone lives on Instagram. But other than that, there is a lot more: for example, local newspapers, or some of the influencers who are using blogs, or some of the local channels that are available, the media outlets. There is a bunch of things you can think of, and this is only from a culinary point of view. And of course, there are other sides to food. There are things related to research, especially in terms of patents, the new kinds of ingredients, the food processing techniques that are coming in. And there is no denying, I would say, that this is where we live. Everyone leaves a digital footprint on the Internet. It doesn't matter whether it is an individual or a business. So it is about having an understanding of how the ecosystem is changing overall, across consumers, businesses, and the new competition. From a business strategy perspective, people call it Porter's five forces, right? How do you develop an understanding of those five forces in real time or near real time? I think that's the key.
I think to do that very well, it is important to look at a long tail of data sources, which helps you spot opportunities much earlier than you can when you only rely on social media. Because I think it's fair to say that if something is trending on social media, then it is already somewhat known to the world. Obviously, you probably won't have much of an information advantage from that.
John: Right. So things like patents or abstracts used in food science, that would be one example or food tech startup prospectuses, for example, that might be another.
Sai: And there's a bunch of news articles and blogs. I think that's a big, long tail, I would say. And of course, there are some difficulties in how you make sure which ones are authentic. But you can find there is a whole lot that you can take advantage of.
Sai: Yeah, sure. Absolutely. So I think that's where some of the Holy Grail is, in terms of being able to process this data and then making it into a format or a structure that is consumable. There is one underlying principle, I would say. When you look at analytics tools, predominantly a lot of them work on readily available structured data. Once you have structured data, any kind of analytics is possible from that. So the real problem is how you build that structured data. And that becomes an even harder problem for us, for two reasons. One is that any data that comes from the Internet is naturally unstructured. And of course, we are dealing with the additional complexity of long-tail sources as well. So it's not just about unstructured data. It is many kinds of unstructured data coming from a much higher number of sources. From that point of view, one of the biggest challenges for us is making sure that the data, no matter where it comes from, no matter what language it is in, no matter what kind of content it is, whether it is video, audio, text, or a book, is ultimately converted into one common schema. Let me give a very simplistic example. Let's say we want to structure the world of recipes. Recipes, of course, come in a lot of formats and languages, all of that. And now let's say you want to put them into a tabular format, which consists of ingredients, cooking styles, flavor, nutrition, a bunch of attributes. So, yeah, that's where a lot of our work goes: converting this unstructured data to structured data, building data processing techniques that are language agnostic and content agnostic, and making sure that everything fits into one common schema as the output at the end.
From there, of course, once you have structured data, you can build any kind of application, and that opens up a lot of doors for us.
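The recipe example Sai gives can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions: the `RecipeRecord` schema, the tiny lookup tables, and the `from_text` helper are all hypothetical, and a real pipeline would need far more than keyword lookup. The point is just that text from different sources and languages ends up in the same tabular shape.

```python
from dataclasses import dataclass


@dataclass
class RecipeRecord:
    """Hypothetical common schema: every source is reduced to these columns."""
    title: str
    ingredients: list
    cooking_styles: list
    source: str


# Hypothetical lookup tables mapping surface forms (English and Spanish here)
# to canonical names. Real systems would not rely on hand-written tables.
CANONICAL_INGREDIENTS = {
    "tomato": "tomato", "tomatoes": "tomato", "tomate": "tomato",
    "basil": "basil", "albahaca": "basil",
}
CANONICAL_STYLES = {"roasted": "roast", "asado": "roast"}


def normalize(tokens, vocab):
    """Map raw tokens to canonical names, keeping first-seen order, no repeats."""
    found = []
    for token in tokens:
        canonical = vocab.get(token.lower().strip(".,;:"))
        if canonical and canonical not in found:
            found.append(canonical)
    return found


def from_text(title, text, source):
    """Convert one piece of unstructured text into the common schema."""
    tokens = text.split()
    return RecipeRecord(
        title=title,
        ingredients=normalize(tokens, CANONICAL_INGREDIENTS),
        cooking_styles=normalize(tokens, CANONICAL_STYLES),
        source=source,
    )


if __name__ == "__main__":
    print(from_text("Roasted tomatoes", "Roasted tomatoes with basil.", "blog"))
    print(from_text("Tomate asado", "Tomate asado con albahaca.", "newspaper"))
```

Both inputs produce records with the same ingredient and cooking-style values, which is what makes downstream analytics independent of the original language or source.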
John: Well, that's fascinating. So I would like to talk a little bit about the language side of this. So you're searching in many languages, and then you have some sort of automatic translation? I mean, I had always, I suppose, assumed that you would have to pick a language, but now you're pulling data across many languages and aggregating it?
Sai: So, yeah, that's where a bit of our secret sauce comes in as well. But these are still well-known principles to any technology company, I would say. Some of them have to do with how you build common language representations across languages. That's one. And most importantly for us, it's more than language. It's actually having that domain specialization. That is what differentiates us. When I say domain specialization, it's about how you translate the problems of humans. All of us have domain knowledge, in terms of both commonsense knowledge and expert knowledge, when it comes to food. So having that in place in combination with linguistic techniques is what really matters. To answer your question about multiple languages itself, there is a combination of techniques that can be relied upon. One of them that we use is looking at some of the common representations of language, which we call word embeddings in the world of data science, and which can be built using multiple languages. And of course, we fine-tune them using the corpus that we scrape specifically for food and beverage data.
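To make the word-embedding idea concrete, here is a minimal Python sketch. It assumes a shared multilingual embedding space in which words from different languages with similar meanings sit close together; the three-dimensional vectors below are invented for illustration (real embeddings have hundreds of dimensions and are learned from text), and nothing here reflects Spoonshot's actual models.

```python
import math

# Hypothetical toy vectors in a shared multilingual space.
# The numbers are made up purely to illustrate the lookup.
EMBEDDINGS = {
    "tomato": [0.90, 0.10, 0.00],    # English
    "tomate": [0.88, 0.12, 0.02],    # Spanish / French / German
    "cinnamon": [0.10, 0.90, 0.20],  # English
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to `word`'s."""
    target = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


if __name__ == "__main__":
    print(nearest("tomate", EMBEDDINGS))
```

Because similar words share a neighborhood regardless of language, a lookup like `nearest` can link mentions across languages without an explicit translation step.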
John: I see. Now, do you generally have one database that you pull from for all your clients? Like when you have a client, they have an interface? Just so everyone understands the kind of full service: typically a client signs up, they have access to some sort of portal, and there's the ability to run an analysis. Maybe you can walk us through the user experience when you're using Spoonshot. How does a user use Spoonshot?
Sai: So I'll talk about how the data gets structured and then how the interface is built on top of it. If I were to summarize what we do from a tech point of view at Spoonshot, we are structuring the data of food. We have developed an engine that essentially structures the data of food, and then a unified data platform, I would say. Just think of it as an ever-growing data platform containing all kinds of new data that is coming onto the Internet and ultimately getting structured into one common place. Once you have that, that's the core output of our technology. And of course, we have built multiple tools that users can use, from a UI point of view, to interact with that data. Ultimately, think of the experience as similar to any typical analytics tool, where you have a predefined set of, you know, filters or queries that you can execute, which ultimately lets you cut the data in the way that you want. In a very similar way, our food and beverage professionals can, of course, ask more domain-specific questions. That's what our tools enable. Whether you want to ask a culinary-related question, a trend-related question, or a consumer insights question doesn't matter. The tools enable you to translate your question into an interaction with the data. So in a simple way, it just helps you cut the data in the way that you want so that you can find the answer to the question that you have.
John: So, for example, if I was operating in a particular product category, I might say, okay, within this product category, I have this ingredient. What are other ingredients that we're starting to see? You could do an analysis and see which ingredients are being combined with the ingredient you're interested in? That kind of thing would be kind of typical, so that people can specify the groups. I mean, I think this is fantastically valuable, and something that you and I have talked about before, and that I think is really exciting, is the possibility of combining data. So where are you now, and what are your thoughts for the future in terms of empowering food scientists to combine their data with the data that you've collected? I mean, I think that's where there's a real opportunity for specialized insights. So what are your thoughts on that going forward?
Sai: Yeah, I think there are a bunch. As we stand today, we are looking at the front end of the innovation spectrum, where there is more of the R&D and the traditional research kind of work that people used to do before. Doing that using more data-driven tools is, I think, the first leg. That's where we are now. But our idea is to extend much further. That is just from an FMCG point of view. The broader picture, I would say, is that in the world we live in today, there is an increasing number of knowledge workers around the world. In a very simplistic way, my definition of a knowledge worker is someone who uses data or intelligence to do their job. It could be as simple as doing a Google search, as complex as using some of these advanced analytics tools, or anything in between. The same thing is true in the food and beverage industry as well. But the problem is there are no domain-specialized data tools that help these knowledge workers do their job. So I think that's the overarching idea. Food and beverage, and this is no surprise, is a giant sector. There are a lot of businesses in that sector, from restaurants and hospitality to FMCG. As of today, we are more focused on the FMCG side of the spectrum, and that too on front-end innovation intelligence. From there, we of course want to expand to the entire spectrum, you know, whether the next step is food scientists, packaging intelligence, pricing intelligence, and so on and so forth. And of course, also finding new opportunities in terms of expanding into other businesses. For example, an adjacent innovation opportunity is in cloud kitchens, especially given the fact that restaurants have now become more remote in terms of their operations. There are not many customers visiting a physical restaurant anymore.
I think the only way a cloud kitchen can differentiate itself from other cloud kitchens is through menu innovation that is more targeted to the needs of the local consumers around it. So I think there is a bunch of opportunities that we see there.
John: And I think that's definitely a really good point. So all right, one thing I do need to ask about, because, again, sometimes the best questions are the selfish questions. So here's my question. You have all these data now. What are you using to organize all of that data? I mean, of course, I think the data cleaning is very interesting. But once you have the data in some sort of more structured form, how are you organizing it? What are your kind of preferred tools for knowledge management?
Sai: I think there's a mix, I would say. Ultimately, I think if I understood the question correctly, it's more about you're asking what kind of tools are needed for us to organize this data and then make it possible to build the tools that we built?
John: That's right. Yeah, because from what I understand, you're not just, say, using Neo4j or something like that, right? It seems like you have a more elaborate kind of data engineering solution than just a straightforward graph database, if I understand your technology correctly.
Sai: Yeah, that's true. Actually, we deal with a lot of data. Just to give you a number, we collect more than 200 GB of data, well, for 3-4 days of the week. So I think we are already at close to 10 to 15 terabytes of data as we stand today. And this data is only growing, because as our tools get better at collecting data at scale, the data only explodes in terms of the variety, in terms of the volume, all of it. You can think of the big three or four Vs that people talk about in big data: volume, veracity, velocity, all of that. Right? So, yeah, coming back to the point, there are a lot of both engineering complexities and data science complexities while we're dealing with this data. Coming specifically to the engineering complexities, there is a concept we follow in our company called instant data. What I mean is that the real value add that we have for our customers is the data that we bring in. Right? So without any new data coming in, there is no new value added for the customer. We look at it in that sense. So the challenge for us is how to process the data as soon as possible, transform it in that process, and make it available to our users. As for the tools, of course, we use standard big data tools, starting from object storage and a bunch of SQL and NoSQL kinds of databases. We are heavy users of Spark. And then, yeah, we of course use Neo4j and those kinds of tools as well, you know, graph databases. Ultimately, when we are building the tool for our clients, it's more about what kind of data storage tools can scale for the queries that are getting executed against them. For a technology company, typically there are some primary databases and secondary databases. Primary databases serve as the source of truth.
And secondary databases are built more for performance, in terms of the queries, in terms of how the data interaction can be made much faster, and so on and so forth. There are a bunch of design choices. I hope that kind of answers your question.
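The primary/secondary split Sai describes can be illustrated with a small in-memory Python sketch. This is an assumption-laden toy, not Spoonshot's architecture: the `RecipeStore` class and its methods are hypothetical, a plain dict stands in for the primary database, and an inverted index stands in for a query-optimized secondary store.

```python
from collections import defaultdict


class RecipeStore:
    """Toy model of a primary store plus a derived, query-optimized index."""

    def __init__(self):
        # Primary store: the source of truth, keyed by record ID.
        self.primary = {}
        # Secondary store: ingredient -> set of record IDs, derived from primary.
        self.by_ingredient = defaultdict(set)

    def put(self, record_id, ingredients):
        """Write to the primary store, then keep the secondary index in sync."""
        for old in self.primary.get(record_id, ()):
            self.by_ingredient[old].discard(record_id)
        self.primary[record_id] = tuple(ingredients)
        for ingredient in self.primary[record_id]:
            self.by_ingredient[ingredient].add(record_id)

    def find(self, ingredient):
        """Answer a query from the secondary index without scanning primary."""
        return sorted(self.by_ingredient.get(ingredient, ()))

    def rebuild_index(self):
        """Secondary data is disposable: it can be regenerated from primary."""
        self.by_ingredient = defaultdict(set)
        for record_id, ingredients in self.primary.items():
            for ingredient in ingredients:
                self.by_ingredient[ingredient].add(record_id)


if __name__ == "__main__":
    store = RecipeStore()
    store.put("r1", ["tomato", "basil"])
    store.put("r2", ["tomato"])
    print(store.find("tomato"))
```

The design choice this highlights is that the secondary structure can be dropped and rebuilt at any time, while the primary store is the only copy that must never be lost.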
John: That's right. And I think it's fascinating. I mean, I feel like it's important for our listeners to understand that you're solving some pretty serious data engineering problems. It is not a simple thing to take all the data, process it, and turn this much data around so quickly. So what sort of latency is there? I mean, what's the delay between you collecting data and the data being available to your users? Is it on the order of weeks, days, hours? How quickly do the data get processed and become available to the users?
Sai: Sure. As of today, we are pretty much working on a batch model. That means we let the data accumulate on a daily basis and then run the transformations of the data at the end of the day.
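A daily batch model of the kind Sai describes can be sketched as follows. This is a minimal Python illustration under assumptions: `run_daily_batches` is a hypothetical helper, the sample items and dates are invented, and the "transformation" is just whitespace and case cleanup standing in for whatever real processing is applied.

```python
from collections import defaultdict
from datetime import date


def run_daily_batches(raw_items, transform):
    """Group collected items by day, then transform each day's batch at once.

    `raw_items` is an iterable of (collection_date, payload) pairs.
    Returns a dict from date to the list of transformed payloads,
    with the days in chronological order.
    """
    by_day = defaultdict(list)
    for collected_on, payload in raw_items:
        by_day[collected_on].append(payload)
    return {day: [transform(p) for p in by_day[day]] for day in sorted(by_day)}


# Hypothetical raw items accumulated over two days, in arrival order.
RAW = [
    (date(2021, 1, 2), "  Oat Milk latte "),
    (date(2021, 1, 1), "Turmeric TEA"),
    (date(2021, 1, 2), "kombucha"),
]

if __name__ == "__main__":
    batches = run_daily_batches(RAW, lambda text: text.strip().lower())
    for day, items in batches.items():
        print(day.isoformat(), items)
```

In a batch model the latency is bounded by the batch interval, so with a daily run new data becomes visible to users roughly one day after collection at the latest.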
John: I see, so pretty much every day a user wakes up and there's new information that's been collected, more or less. Yeah, this is really great. Alright, Sai, well, we are actually out of time. So I think this has been extremely interesting. And I would definitely recommend to our listeners to check out Spoonshot. I think you all have a really nice product. I think you're bringing real value in the kind of wide net you're casting, you know, as you described, the long tail of data that's out there. So if people want to learn more about Spoonshot and they want to connect with you, what are some ways that they can reach out and find you or find Spoonshot?
Sai: I think, of course, for all the basic information you can see our website, spoonshot.com. You can subscribe to our white papers as well. We write a bunch of white papers every month. There is the library, of course. And you can see a bunch of content on the Spoonshot channel on YouTube as well, with a similar kind of research that we publish every month. And if you want to contact us, of course, there is always a contact button on spoonshot.com. So I think that's the best way to reach us.
John: And to reach out to you personally, what's the best way for someone to reach out?
Sai: LinkedIn is one of the best ways, or you can email me at my official email address.
John: Okay, that sounds great. Alright, Sai this has been really good. Do you have any last words of wisdom? Anything you want to share with our listeners before we wrap it up?
Sai: Yeah, I think it's a really exciting time that we are living in, especially given the fact that the food and beverage industry as a whole is facing a lot of challenges, from a growing population to nutrition, nutritious meals that need to be served to everyone, and so on and so forth. I think there is a big role for data to play, from the end of the supply chain to, you know, understanding the customers. And of course, people talk about farm to fork. But I think from a data point of view, it's actually fork to farm: the data needs to flow backwards, so that we understand the consumer needs of today and tomorrow and use that data all the way up the supply chain. I think there's a lot of efficiency that can be gained. There's no other way than using data-driven tools that help us get more efficient over time.
John: I really like that, fork to farm. I think that's going to be the show title, so that's very good. Alright, excellent. Thanks a lot, Sai. I appreciate you being on the show today.
Sai: Thanks, John. Thanks for having me.
John: Okay, that's it. Hope you enjoyed this conversation. If you did, please help us grow our audience by telling your friends about AigoraCast and leaving us a positive review on iTunes. Thanks.