About this webinar
Welcome to this insightful webinar! The topic of the webinar is “The Diary of the CTO.” We will delve into a hot topic: the challenges of data integration for small and medium-sized businesses.
The webinar is presented by Monika Holland, a go-to-market professional with a strong commercial background. Monika will talk with William Da Cunha, CTO of Marjory and an expert in data integration and middleware solutions.
Speakers
Monika Holland
CCO at Marjory and PLG Expert
William Da Cunha
CTO at Marjory and Data Integration Expert
What is Marjory
Monika: William, would you like to introduce yourself?
William: Yes. So, I’m the current CTO of Marjory. Marjory is an iPaaS, an Integration Platform as a Service. We are strongly focused on the SMB market because they have very specific needs and challenges that we aim to address. We provide all the necessary tools for integration in the context of SMBs. Whether it’s ETL or fully event-based automation, whatever you need, if it’s asynchronous, you can handle that with Marjory. On my side, it’s almost a decade since I started in the integration industry, and before that, I was a Java developer for a few years. So here I am.
How to become a CTO
William: Yes, truthfully, I’m really passionate about coding and software engineering. I started when I was a kid; I’m not that old, but it was a long time ago. When I started professionally, I still wanted to learn as many things as possible, and the quicker, the better. I was really excited about training. So I got into front-end development and back-end development, and then I had the opportunity to start developing data flows and working on data integration. I started focusing on that because I really love the technical aspects. One of the hardest things is architecture, because you deal with a high volume of data and you have to handle availability. It was really interesting, and it’s always different because each company has its own set of data and its own data flows, even though, when you look closely, you can see that there are recurring patterns. It was a very broad domain, and I really loved that. One of the industries that uses data integration a lot is marketplaces. So I started with one marketplace, then two, and I ended up working for several of them. Marjory was a solution dedicated to marketplaces at the beginning. As I said, it’s an industry that needs a lot of data flows: you have a lot of sellers who are outside the marketplace company, and you need to exchange data with them. You need a full catalog and a unified experience for the customer. So I started collaborating with Marjory because I had experience in both data integration and marketplaces. A lot of passion and a lot of knowledge combined, and here we are.
Why Marjory?
Monika: What excites you the most about your current position? You could be a CTO in any other data company or tech company. Why Marjory?
William: Any company, I don’t know, but for sure it couldn’t be anywhere other than Marjory. First of all, there is my team. I am very lucky; every CTO knows that we can’t do anything without a good team. They are dedicated and passionate people. Truthfully, what we’ve built as a product and the underlying technology are very good. It’s really powerful, and it’s cutting edge. We don’t use technologies just because they are hyped on the market; we really use new and modern ways of working. And last but not least, our CEO, Kamel Tansaout, places a strong emphasis on innovation. Even though we are young (Marjory was created a few years ago), we don’t just try to imitate others or get inspired by others; we try to be better and innovate. So that’s why I love my position at Marjory, and I don’t want to change.
Challenges of Data Integration for SMBs
Monika: All right, so you’re focusing on innovation and innovating in the space. Obviously, data integration and management have become a must-have for every company now. It’s not a nice-to-have anymore. And obviously, small and medium businesses have a lot of different challenges in this area. That is the main topic today: the challenges they face and how they handle data.
Limited resources
William: There are many challenges, obviously, so which is the most challenging for me? The first, by far, is limited resources. A lot of CTOs have worked or still work in big companies, and the way you handle resources in a big company and in an SMB is different. It’s not that SMBs lack resources, but they need to be really efficient. That matters a lot in an SMB because it can make or break the company. The way you treat and manage your resources is fundamental. Resources can be human, meaning expertise, or financial. To work on data flows, the first challenge is how to get the right tools and the right expertise. Expertise is what enables the other challenges to be met, because with the right tools and the right expertise, building data flows is something we’ve been able to do for years. But expertise is essential.
The Significance of Scalability
William: I think the second point I can expand on is scalability. Scalability is always an issue, whether you’re just starting out as an SMB or you’re a big corporate company. It will always be a subject, but for SMBs scalability is essential, more so than for larger companies. In my opinion, scalability is really a support for the business plan. All your infrastructure and your systems are there to support your business. I’ve seen a tendency for business plans to be more aggressive in SMBs than in larger companies. It’s pretty rare for big companies to double their revenue year after year, but SMBs can aim for that. In that case, scalability is important.
As an SMB, you need to grow. And generally, what I see is a lot of ETL jobs, manual scripts, or tools like Talend Open Studio, for example. That works well until you reach a point of growth where you can’t just use these kinds of tools and you can’t be locked into one paradigm. So CTOs need to manage several kinds of data flows. Obviously, we still use a lot of jobs, scripted ones, data prompts, but you also need to handle automation, event-based flows, and so on. In these particular cases, CTOs need to be careful to avoid scalability problems. The cause can be servers and infrastructure, but that’s rarely the case, because you can always add more servers or more powerful ones. What is not well known enough is how to handle data flows at scale.
How to put one data flow into production is generally well documented. But having these data flows running at scale, with millions and millions of events each day, while being highly available, is challenging.
Monika: So what would you say are the most important things to look at? Is infrastructure first?
William: Before the infrastructure, it is the architecture. The difference I mean is that I’m not looking at the power of the server or how many servers there are. The question is: if I need another server, or a bigger one, will it still work the same way? You need an architecture where you know you can grow your infrastructure in order to support your business plan.
And if the first thing is architecture, you need to have the right expertise. Even if it’s not in-house, the skills exist, and you need to get help if you don’t have them within your company.
The Role of Data Governance in Data Integration
Monika: Very interesting. So we said resources, we said scalability. What else would you point out as the main challenges?
William: I think I would go to data governance, because everyone needs to know what’s happening inside their system. But when you start, it’s usually a topic that you really don’t cover very well, and that’s understandable. You first need to have a business before you can make sure it keeps running and secure it. Data governance is a vast subject, but there is one main problem that you can check: data silos.
Data Silos and Their Impact
Data silos mean that there is a bunch of data across your organization that is not fluid, in the sense that it’s not circulating and not everyone knows that this particular set of data even exists in your company. The main threat is that you can’t grow as fast as expected. You’re really cutting yourself off from a lot of opportunities when you create your own data silos. This challenge can be resolved. It’s usually not a technology or a tool problem.
Data silos are generally, first of all, an organization problem. From what I have seen, you can usually tell how a company is organized and how it runs just by taking a look at its architecture and infrastructure. Quite often, you can see that choices were made not because they were good for the architecture but because of how the company is organized. So eliminating data silos is first an organization problem more than a tool problem. For example, a component sits here, but logically it should be there. There, it would grow much more easily, it would be more performant and more usable, and it would break some data silos, but your organization required it to be done another way. So for data silos, obviously your tool should help you, but it’s only a help. The solution should come first from your own organization. If your own organization creates silos, you can do whatever you want with tools; you will still have these problems.
But how can a tool help you? It can give you a real overview of what is flowing, because these tools are transversal.
It can also help you reflect your organization. For example, if you have two, three or four teams developing data flows, maybe they each want their own administrators, and you want to replicate your own organization inside the tool, working as several teams or even business units, so that you can see which part of the organization handles which data. I think that a good solution for governance should be able to give you a real overview of what kind of data is flowing, from which sources, and which solutions are connected.
Monika: So the team, the resources, and the skill sets within the organization play a huge role here, more than we might think. Anything else?
William: First, the tool is just a tool. You need to be sure about your capacity to use it. If you use a hammer for anything other than what a hammer is meant for, it can be the best hammer in the world and you still won’t get the result you’re expecting.
People come first, for sure, and usually it’s not something that organizations put a strong emphasis on. As I said, the trap here is to think, “Okay, it’s easy to do data flows”. For sure, there are a lot of data flows running each day, but if you want to scale, you can get blocked.
You can also think about the place of SaaS solutions, because SaaS solutions are used more and more in all organizations. I think they can also contribute to the data silos that concern us all. Everybody uses some solutions or Google spreadsheets in a slightly shady way that the IT department doesn’t know about. There are a bunch of SaaS solutions to deal with, and you need tools that ease the integration of these SaaS solutions.
There are a lot of tools that maintain good scalability, like Make or Zapier, for sure. We’ll also talk about the Marjory solution, which was made for this.
How to maintain Data Quality
Monika: So, about the SaaS solutions that companies use on a daily basis: how do they affect the quality of the data flowing through them?
William: That’s a good point. The second point of data governance is data quality. Data quality can suffer from a lack of automation, because if you have a solution that is not connected to the other parts of the system, it means someone somewhere needs to do something manually. And as you know, doing things manually can directly generate errors or other issues. So automation is one way to avoid this kind of problem. And then, there are a lot of SaaS solutions that generate data (for example, a CRM).
There are some cases where you want to have more control over the data produced by SaaS solutions, and tools that interconnect solutions can be a good way to do that.
AI in Data Integration
Monika: Absolutely, and in your experience, you’ve also talked about AI in data integration. Could you share how AI plays a role in data integration, especially in the context of Marjory?
William: AI is a vast topic. The hype around AI is really related to ChatGPT, Bard and generative AI. These AIs can help you a lot when you’re creating data flows. Usually, it takes some time to code the data flows, but the most important part is not the development itself; it’s understanding what you should do before developing. That’s why, even if you code for one or two days to create a data flow, it can cost you four or five days just to understand how you will connect these applications or get this new kind of information circulating.
So generative AI is a huge opportunity to help write the documentation and specifications you have to produce before coding. That way, you can save 30% or 40% of the time of a data integration project.
Monika: So does this influence your strategy or how you approach data integration at Marjory?
William: Yes, for sure! We are currently working on our own generative AI for the end of the year. We try not to use a prompt because, as I said, we try to be innovative, so we don’t just want to do what the others do. We are working on what AI can do other than “just” generating code.
I think of two things that we’re currently working on, in R&D:
- The first is this: a huge part of the work is knowing how you will connect things so that you have a good specification. Generative AI then uses that work to create the data flows. That is how it is used for the moment, making the connections. But AI can also be good for the specification itself, because it can learn about your company by gathering a lot of documentation from your Confluence, your Google Docs and so on, and learn how your company works. AI can also learn the best practices of data integration and propose specifications that you can just review and adapt if needed.
- The second idea, which I think is the better opportunity for AI, is during the run. You all need to understand that the run is what matters the most. You need to be good at running data flows, and AI can help you to be more proactive, detecting an incident before it happens. For example: no orders in my ordering system for the last five minutes. That’s something that can happen, so the AI sends a notification saying that maybe something is wrong. I think there is a huge opportunity in the run segment, because you can see whatever is not running in your system.
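The “no order since five minutes ago” case can be illustrated with a small sketch. This is not Marjory’s implementation and it is not AI: it is a plain fixed-threshold check, assuming a hypothetical mapping from flow names to the time of their last event, and it only shows the kind of signal a smarter detector would build on.

```python
from datetime import datetime, timedelta

# Hypothetical default: a flow is considered suspicious after 5 minutes of silence.
DEFAULT_MAX_SILENCE = timedelta(minutes=5)


def is_flow_silent(last_event_at, now, max_silence=DEFAULT_MAX_SILENCE):
    """Return True when no event has been seen for longer than max_silence."""
    return now - last_event_at > max_silence


def silent_flows(last_seen, now, max_silence=DEFAULT_MAX_SILENCE):
    """Return the sorted names of flows that have been silent for too long.

    last_seen maps a flow name (assumed naming) to the datetime of its last event.
    """
    return sorted(
        name
        for name, last_event_at in last_seen.items()
        if is_flow_silent(last_event_at, now, max_silence)
    )


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    last_seen = {
        "orders": now - timedelta(minutes=7),    # silent for too long
        "payments": now - timedelta(minutes=1),  # still flowing
    }
    print(silent_flows(last_seen, now))  # ['orders']
```

A fixed threshold like this will raise false alarms at night or on weekends, which is where a model that has learned the normal rhythm of each flow could plausibly do better.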
The Importance of Run in Data Integration
Run as the Core of Data Integration
Monika: So the run is what really matters, what’s important to look at. Can you walk us through that in more detail?
William: I think the first challenge is visibility. I talked a lot about scalability, which is a run problem, but in the run, I think visibility is the most problematic topic. If you’re running ETL jobs, it’s really easy. You can forget about observability, because a job is something that is triggered at a given time, and it works or it doesn’t work. Almost all the tools I know of handle this really well. You just have to ensure that your integration system is running. It’s something we have been able to do in the run for 10 or 20 years; it’s really an old subject. It’s really easy to understand whether something is working or not.
But if you’re not running ETL jobs, for example if you are working with event-based automation, or just a simple ESB for the older setups, I think that’s where the pain starts. When you are in an event-based orchestration, you have something that is flowing. It’s not like, at a given time, you have a job that works or doesn’t work. You have some events that are triggered and work, and some that don’t. And a failed event doesn’t necessarily mean that something went wrong in the system; it can be that, or it can be that the event itself was not proper. It can be because of data quality, so you have to work to understand what is going on. It’s much like the energy industry with electricity, for example: it’s always flowing, and if there is no more energy, something is wrong. You need to dive deep and understand what is going on.
Observability, when you’re using this kind of integration method, can be a bit hellish. You need to have good tools, the right tools, to handle that. And for SMBs, I still link that to the first challenge, resources, which can be tricky. I think the two most important subjects are scalability and observability. Availability is also a huge subject, but it’s linked to scalability. When you want to be scalable, you need to be available, because given the cost of resources it’s easier to add more servers than to increase the power of the existing servers. So if you are scalable, you are highly available.
And I think that’s it. There is more to say, but if you want, we can dive deeper into the subject offline, for sure.
Monika: It was very interesting, and I understand that Marjory is actually focusing specifically on the run, making it a really robust solution. Before we go, in one sentence if you can: where do you see the industry evolving over the next five to ten years in workflows, data integration and automation?
William: I think that the data integration industry will really be changed by AI. But I think that we are at the start of AI. It’s really complicated to tell you what we will be doing in 10 years. I think it will come down to AI. I’m 80% sure about that, because it’s the most impressive innovation that we have had.
Monika: Let’s finish up on that. Thank you so much to everyone for joining us today. Thank you, William, it was fascinating to hear your thoughts on a subject that is such a hot topic for all of us. Thank you so much. We’ll see you next time.
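The distinction William draws here, between an event that fails because something is wrong in the system and an event that was not proper in the first place, can be sketched as follows. This is only an illustration under assumptions: the event fields (`order_id`, `amount`), the `InvalidEvent` exception and the outcome labels are all hypothetical, not part of any particular tool.

```python
from collections import Counter


class InvalidEvent(Exception):
    """Raised when the event itself is malformed (a data-quality problem)."""


def validate(event):
    """Reject events that lack an order id or a positive numeric amount (assumed schema)."""
    amount = event.get("amount")
    if "order_id" not in event:
        raise InvalidEvent("missing order_id")
    if not isinstance(amount, (int, float)) or amount <= 0:
        raise InvalidEvent("amount must be a positive number")


def run_flow(events, handler):
    """Process every event and count outcomes per category instead of pass/fail for the whole run."""
    outcomes = Counter()
    for event in events:
        try:
            validate(event)
            handler(event)
            outcomes["ok"] += 1
        except InvalidEvent:
            # The event was not proper: look at the source data, not the flow.
            outcomes["invalid_event"] += 1
        except Exception:
            # The event was fine but processing failed: look at the flow or its targets.
            outcomes["handler_error"] += 1
    return outcomes


def example_handler(event):
    """Stand-in for a real delivery step; fails for one specific order."""
    if event["order_id"] == "A3":
        raise RuntimeError("downstream system unavailable")


if __name__ == "__main__":
    events = [
        {"order_id": "A1", "amount": 10},
        {"amount": 5},                    # malformed: no order_id
        {"order_id": "A3", "amount": 7},  # valid, but the handler fails
    ]
    print(dict(run_flow(events, example_handler)))
```

Keeping the two failure categories separate is one simple way to know whether to investigate the data source or the flow itself when the counts start to move.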