October 12, 2021

Backblaze + Cycle.io

In this episode, Jake Warner chats with Elton Carneiro, Director of Partnerships at Backblaze Cloud Storage & Backup. Discussions include the Cycle + Backblaze partnership, integration, and coming together to solve developer-centric needs.

Transcript

Jake Warner + Elton Carneiro

Jake Warner: [00:00:14] Hello and welcome to the eighth episode of our podcast. Today we have Elton Carneiro from Backblaze, and we have a lot of really exciting things to talk about. I've been looking forward to filming this episode for a while now. We've been working with Backblaze for a number of years, but we started talking about doing this recently, well, it's not even recently, it's been months now. Time is a bit of a blur. But, you know, I'm just really excited to be able to do this podcast with Backblaze and have Elton here. So I guess starting off, Elton, it'd be great to hear what you do at Backblaze, and maybe an introduction for people who have not heard of Backblaze before.

Elton Carneiro: [00:00:58] Yeah. Thanks, Jake, and thanks for the invite. You know, really excited to be doing this podcast with you, and we'll dive into the details more. But I was super impressed when I came across Cycle, and I'm still super impressed with all the things you guys have accomplished. A little bit about me: my name is Elton Carneiro. I head up the partnerships team here at Backblaze. My role covers all our technology partners, alliance partners, channel, as well as our MSP programs. And for those of you who have not heard of Backblaze, we are a storage platform in the cloud that allows customers to store any kind of data. We offer two services: our Computer Backup service, as well as our B2 Cloud Storage service. Computer Backup is offered at $7 per device per month, back up as much as you want. And the platform that we built on top of the Computer Backup backend infrastructure is our B2 Cloud Storage, and that's available for half a penny a gig, $5 per terabyte; store any kind of data you wish. We've been doing this for a long time. The company was founded in 2007 with the Computer Backup service. We've evolved, we've grown. We have close to two exabytes of data under management and half a million customers globally. And we do what we do best: store data.

Jake Warner: [00:02:19] Excellent. So you said two exabytes of data. Do you have any rough idea how many drives that is, or how many servers, or anything that helps put that in perspective for people who might not realize just how much data two exabytes is?

Elton Carneiro: [00:02:35] Totally. Yeah. So if you haven't had a chance, I highly encourage you to look at our blog. We have a Hard Drive Stats blog that we publish every quarter, and we have close to 200,000 spinning hard drives in our data centers across multiple regions globally, a mix of different hard drives, different sizes, different formats, different companies. We basically spread our business across all the large hard drive manufacturers, but it's close to 200,000. And in our latest edition of the Hard Drive Stats blog, we also introduced how SSDs play a role in our infrastructure. So I highly encourage you, if you haven't had a chance, take a look. If you're out looking for hard drives to purchase for whatever reason, we give you the annualized failure rates of those hard drives: how they are performing, which models are performing better than others. It gives you an understanding of where you should put your money in terms of buying storage.

Jake Warner: [00:03:38] As you mention that, I remember one of the first times I ever came across Backblaze. This was before our companies started working together, and I think we've been working together since early 2019, maybe even late 2018. But one of the first things I came across from Backblaze was one of those blogs you put out about how many hard drives you had, the failure rate by manufacturer, and things like that. And I was like, man, that's actually really interesting, because that's data you could practically sell back to the manufacturers: how reliable is your stuff? Because if you have 200,000 drives that you're working with, that's definitely at scale. So you actually have real numbers to indicate which are better than others, or at least more reliable, maybe, is the better term.

Elton Carneiro: [00:04:30] Yeah, yeah, totally. And the reason for us to do this is more to give back to the community, right? We understand that nearly every person out there runs some form of a hard drive, whether it's in their laptop or their desktop, and a lot of consumers, also prosumers, go out and buy NAS devices where they need hard drives that fit into the NAS. We understand that, like anything else, some things will fail more than others, and some things are built more durably than others. So first, it's just about giving back to the community and letting the community know what's going on in our data centers, how we see the things that we have purchased performing in our data centers. And that goes back to the philosophy of Backblaze's business, going back to the transparency factor and just making sure that we talk about what we do openly on our blog. So other than the hard drive stats, for any of our viewers out there who want to learn more about Backblaze, there are lots and lots of different stories and articles on the company, where we came from, where we're going, some of the cool stuff we've done, as well as our thoughts on the various industry topics that are up and coming.

Jake Warner: [00:05:44] Excellent. And I do have one last question on the drive thing. This is me nerding out for a moment. I remember when I first created an account with Backblaze, when I was just playing around with it years ago, you had a thing called, and I might be remembering wrong, but I believe it was called a Fireball or something like that. I'm sure you'll do a much better job of explaining it, but my recollection was that it was this storage device that I could rent from you, fill up with data, so that I didn't have to send potentially terabytes of data over the Internet, and then physically mail it back to you. What's that about? Obviously, I've never used it, but it's amazing that some companies have that much data that's important to them.

Elton Carneiro: [00:06:29] Totally. Yeah. And you're totally right. So the Fireball service is a rapid ingest service that we offer today, right? We've actually increased the capacity of each Fireball and how much data a customer can store on it. But the idea is just that you rent this device from us and we ship it to you. You basically connect it to your network, copy all the data you want to copy over to Backblaze, and ship it back to us. We'll load it up using our fast connectivity into our cloud and store it in the bucket of your choice. The whole premise around this is we totally get that you can only send so much data over the Internet connection that you have, right? To put things into perspective, on a one gigabit per second connection, if you were to transfer data 24 hours a day, you'd be getting about ten terabytes of data transfer per day. And our Fireball has a capacity of 96 terabytes, more or less. Over the Internet, it's going to take you about 9 to 10 days to transfer that data. With the Fireball service, you can plug it into your network and use any solution you want to copy data. We have integrations with all kinds of different technologies out there if you're using them on premise; otherwise even a Windows copy or, you know, a Linux copy command. You copy the data onto the Fireball, you send it back to us, your data goes into a bucket, and voila.

Jake Warner: [00:07:53] Again, it's just so neat. It reminds me of, and I'm guessing you've seen this, being in data storage, there's a Wikipedia article about it, I believe it's called IP over carrier pigeon or something like that. Have you come across that? Someone literally was sending a flash drive attached to a carrier pigeon to transmit data. And it was amazing because it's super high latency, but then suddenly when you get the data, it's this huge chunk of it. And it's interesting to think back. I know that with Internet speeds faster today it's less important, but back when the initial test was done, you had a max Internet speed of maybe ten megabits. So the fact that someone could attach a flash drive to a bird and just send it, and you had more throughput via literally a physical bird, was actually really exciting and kind of funny.

Elton Carneiro: [00:08:54] Yeah, though the term I used to use a lot is sneakernet. Basically someone filling a backpack full of hard drives and running across, or traveling by whichever way of commuting they choose, from one place to another, and moving data faster that way. We've evolved, you know. We have lots of people with gigabit connections going into their homes, multiple ten gigs going into data centers, et cetera. We have tons of bandwidth coming into our data centers, over 800 gigabits per second. And if we do the math around that, ten gigabits can push 100 terabytes a day. 100 gigabits can push a petabyte a day. So if you add up that 800 gigabits, we can take in on the order of 8 to 9 petabytes of data a day.
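The back-of-the-envelope math Elton walks through is easy to sanity-check. A minimal Go sketch of the conversion from a sustained line rate in gigabits per second to terabytes per day, using decimal units:

```go
package main

import "fmt"

// tbPerDay converts a sustained line rate in gigabits per second into
// terabytes transferred per day (decimal units: 1 TB = 1e12 bytes).
func tbPerDay(gbps float64) float64 {
	bytesPerSecond := gbps * 1e9 / 8 // gigabits -> bytes
	return bytesPerSecond * 86400 / 1e12
}

func main() {
	fmt.Printf("1 Gbps   -> %.1f TB/day\n", tbPerDay(1))          // ~10.8 TB/day
	fmt.Printf("10 Gbps  -> %.1f TB/day\n", tbPerDay(10))         // ~108 TB/day
	fmt.Printf("800 Gbps -> %.2f PB/day\n", tbPerDay(800)/1000)   // ~8.64 PB/day
}
```

The same arithmetic covers the Fireball example earlier in the conversation: at roughly 10.8 TB per day over a 1 Gbps line, a 96 TB device would take about nine days to transfer over the wire, matching the 9 to 10 days quoted above.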

Jake Warner: [00:09:54] That is insane. But it helps bridge these conversations, right? Because as Cycle, we were looking for a partner for storage. Like I've already mentioned during this podcast, for anyone who's not familiar, we did partner with Backblaze back in, I don't remember exactly when it was, I think it was late 2018, maybe early 2019. I know time is a bit of a blur. But we integrated with Backblaze because we wanted to be able to store our base container images, right? And that was a perspective we took that's maybe different from a lot of other container companies in the space. A lot of other companies said, hey, we're going to integrate with Docker Hub or some other registry, but the idea was that you would still store the main container images yourself, and any time that container platform would need that image, it would pull it from your data source. And our thought was, well, what happens someday if Docker Hub changes its pricing model, which it ended up doing, or what if they limit how many images you can pull, which they ended up doing? There were so many risks we tried to eliminate, so we said, hey, when you pull in an image, we are going to actually store a copy of that image ourselves.

Jake Warner: [00:11:05] And as a company, we decided we needed to find a reliable, solid object storage provider that allows us to store an infinite amount of data without us having to be responsible for deploying new servers or installing some open source software to help manage a huge deployment of storage. Obviously Backblaze was the company that we chose, and it was really neat because, as you're talking about the capabilities of Backblaze today, how much data you can store and how much bandwidth can come in and out, it's just insane. When we were going through that initial testing, we tried a couple of different providers before we ended up saying, hey, Backblaze is the answer here. And it was crazy because with some of these providers, hey, we had good download speeds, but the upload speeds were terrible, or maybe the opposite, or sometimes it was unreliable. With Backblaze, it was that perfect spot, or I guess it met the criteria that we needed, that allowed us to solve that problem. And so we started talking, what, I think three or four months ago now: all right, let's continue building this partnership beyond just being able to store these base container images.

Jake Warner: [00:12:17] We wanted to be able to store the backups that our users were generating. That was a highly requested feature that a number of our customers were asking for, right? I don't want to get into backups yet, because I think there's a lot of fun stuff to dive into there. But specifically, I want to talk about why it made a lot of sense for our two companies, having worked together before, to increase the depth of that partnership and work together in so many other ways. And I know that backups is the start of it, and there are some other things we have planned for down the road. But I think the thing that resonated really well with us is we both realized that, at the end of the day, we understand developers, we work with developers. And it'd be great to hear about Backblaze's experience with selling to developers and working with developers.

Elton Carneiro: [00:13:23] Yeah, totally. And it's interesting because the thing that you integrated Backblaze for goes back to how we built our business and the fundamental use case we built it on, which is backup, right? We allowed customers to back up their data to the cloud, have a copy in the cloud, so if anything happened, they could easily recover and restore from where they left off. To segue into the kind of audience that Cycle is focused on, as well as what we are focused on today, which is the developer segment: we started seeing a lot of our customers come to us and say, hey, we are building in the cloud, we need reliable cloud storage so we can continue building in the cloud, and we need to make sure that we can serve our entire business from the cloud. We don't have on-prem infrastructure. Everything is running in the cloud, and we need the storage platform. And when this happened, if we go back to last year, we launched the S3 compatible APIs. What this allowed us to do is offer our public cloud storage using the same API that our developer customers were accustomed to when they were developing on Amazon or Google or the other clouds out there. And by doing so, we made it super simple for developers to make that switch from using those clouds to using Backblaze B2 for their storage needs.

Elton Carneiro: [00:14:58] And we started basically getting a lot of the customers who had built in Amazon and other clouds to start moving to Backblaze B2 and using us. The use cases are mostly around how do I serve up content efficiently, how do I serve our data efficiently: not only store it efficiently, but also be able to retrieve it and give it back to the people who need to consume that data in various different forms. And we've built an ecosystem around our storage platform. We focus on storage as what we do; we love doing storage. But we've built an ecosystem of partners around what we've built to allow our customers to leverage the features and functionality of building in the cloud, whether it's a CDN, an ability to run compute, or things like image resizing or video transcoding. This can all be done in the cloud today, and you're not limited to having to do, you know, a CapEx purchase of licenses, software, and infrastructure. And we're seeing a lot more developers starting to build in the cloud, store in the cloud, and consume the cloud, versus having to worry about buying on-premise infrastructure.

Jake Warner: [00:16:09] So with the developer segment that you've been working with, have you found any really niche verticals that you've seen most of your customers fit into? Obviously, Backblaze does a lot with backups, but do you see, like, a lot of companies coming to you for storing videos, or is there a specific niche that you've encountered more than another?

Elton Carneiro: [00:16:36] Yeah, totally. So one of the biggest niches is basically around delivery, right? Content delivery, whether it's a company that's building out a photo library for customers and delivering photos over the web, or a company building a video service. We have a case study with a company called Kanopy. Kanopy is like Netflix, but for libraries, where they basically host a whole bunch of content that isn't for a Netflix audience but is more for education purposes and so on. They were built on Amazon, and they decided to move to Backblaze B2 because we were able to offer them the storage backend they needed to deliver to their customers, attached to a CDN that allowed them to get that optimal, really low-latency edge experience and let their consumers consume the content they have sitting in Backblaze B2. So lots of different use cases, but definitely the video and image workflows, being able to deliver that, is a niche. As well as, think of anyone out there building any solution or product, whether it's, tying back to, you know, Jake, what you're doing with Cycle to back up your cloud infrastructure and cloud containers to Backblaze, or being able to store data, whatever the data may be. You need to store it in the cloud, and different APIs exist today; one of the most common ones is the S3 API in terms of interfacing with cloud storage, and we offer that. So again, the workflows are endless, but feel free to reach out to us and we're happy to engage with you and talk about how it's possible to use Backblaze.

Jake Warner: [00:18:27] And I guess on that last note, I saw on your website, I think it was last week, that you had, I don't want to call it a promotion, but some sort of announcement up where you were talking about people hosting on, I think it was S3, and was it Google, the other one? You were saying, hey, if you're hosting there today, reach out; we have a set of tools and things like that that help move those companies over to Backblaze. If you want to dive in?

Elton Carneiro: [00:19:01] Yeah, totally. So, going back to what I mentioned, last year we launched the S3 APIs, right? And we started seeing all these customers who had built in Amazon. When we started targeting customers who wanted to save money, save costs, or look at leaving Amazon for whatever reason, the biggest objection we faced was the egress fees in Amazon or other clouds. The term being used is that these clouds are like Hotel California: once you check in, you can never check out. And they make it really expensive for customers to check out and move their data out of those clouds. So we got crafty. We put together a solution, a service actually, that a customer pays $0 for if they have more than ten terabytes on Amazon, Google, name your cloud out there. We'll migrate that data for you at no cost to the customer. We take care of the egress fees, we take care of the migration fees.

Elton Carneiro: [00:20:02] It's pretty much a white glove service that you benefit from as a customer, and your data basically ends up in Backblaze B2, free for you to continue consuming. And the good part about it is all you do is switch an endpoint. With technologies like CDNs, you can get really smart by putting rules into the CDN that say, hey, for the first X amount of time while the migration is happening, the query goes first to your original origin store, and if you get back a 404, redirect to Backblaze B2. And after that amount of time, when the majority of your content is sitting in Backblaze B2, you can have the request go first to Backblaze B2, and if you get a 404, go back to the other origin store. So you can actually still remain in production, still remain serving your customers. From a customer standpoint, nothing has changed, but you're actually moving your entire storage from one platform, one cloud, to another cloud.

Jake Warner: [00:21:02] Excellent, excellent, excellent. And on that same note, do you have any idea what the largest migration you've performed yet is?

Elton Carneiro: [00:21:14] I would say it's over a petabyte.

Jake Warner: [00:21:17] And do you have any idea how long that took to migrate?

Elton Carneiro: [00:21:20] So, as I mentioned, we can do anywhere from 200 to 300 terabytes a day out of any of the public clouds. We could probably go faster, but at the same time, we've realized that's a sweet spot in terms of moving content out, whereby we're bound to hit fewer issues overall. Obviously, the faster you go, the more issues creep in. But we've been able to push the envelope to around 200 to 300 terabytes a day. We've been a little shy to go faster. Not that we can't; it's just that those speeds are perfectly fine for a customer wanting to leave, and being able to do it that fast is already pretty impressive.

Jake Warner: [00:22:07] No, it absolutely is. It's hard to comprehend how much data that is. If you actually think about how long copying even a single terabyte takes, and then you realize you're doing that at hundreds of times that scale, it's just insane.

Elton Carneiro: [00:22:25] So the complexity is not only around how much data; there's also complexity in how many objects you migrate, right? You could have a one-terabyte file, and ten of them makes ten terabytes. Moving ten objects is not the challenge. But if you have ten terabytes composed of millions and millions of objects, the complexity there is far higher than moving ten objects of a terabyte each. So it's the size and amount of data, and the number of files, that definitely makes some of the migrations we've done interesting and challenging. And we've done something like 300 or 400 million objects at a time.

Jake Warner: [00:23:16] That's insane. And so, again, as we talk about developers, it's really neat to be able to work with another company on this, because with Cycle, 90, 95, maybe 99% of our audience is developers. That's who we work with. And compared to most companies in our space, if you're a container orchestration platform, most of the time you're working with DevOps more than developers. Cycle's whole goal and focus is to allow developers to have the capabilities of a DevOps team without necessarily needing a DevOps team in the middle. So it's really neat, because both Cycle and Backblaze are able to have that relationship directly with those developers, in a way that we can empower them to do a lot of things, right? So obviously the partnership makes complete sense between both of our companies. But what does the next step of this partnership look like? Backblaze has done backups for a long time now, and it's neat to be able to offer that functionality through Cycle. And one of the ways that we've done it is actually really neat, because between our two companies, we had a conversation last week where it was, how can we... so I guess maybe let me talk about the problem first.

Jake Warner: [00:24:42] The problem is: if a user has a container, and that container has a volume attached to it, and that volume has maybe nine gigabytes of storage used, but the volume is ten gigabytes in total, there's only one gigabyte free. We can't compress that nine gigabytes down to fit in that extra gig to build a full backup. So the question that we had last week in the Backblaze channel was, how can we stream this data? Streaming was something that we had already built in, but the issue we hadn't realized was that if we don't know the total size of the file beforehand, well, the Backblaze API expected us to know that file size ahead of time. And when we didn't even know it ourselves, it made things a lot trickier. And the reasoning behind that, obviously, makes a lot of sense for Backblaze, because as you're getting ready to accept a file to put on your different pods and things, you need to know the total size a user wants to store so that you can properly put it where it needs to be.

Jake Warner: [00:25:56] In our case, we said, hey, we don't want to store anything on our infrastructure larger than a small buffer, because if a user doesn't have that much extra space sitting around on their server to do a backup, we don't want to fill up their server with a backup and prevent that upload from actually happening, right? So that was the context of the problem. And the solution ended up being that we built the functionality so that as we start building that backup, we write the first 50 megabytes to disk. From that point we can say, all right, we know this is either going to be larger than 50 megabytes, or it's 50 megabytes or less, and that allows us to use the proper API call to Backblaze to figure out how we should upload that data. And it's really neat, because the neat thing about the Backblaze API integration is that if you need to upload a large file, obviously you can use the, I think it's just called the large file upload.

Elton Carneiro: [00:26:54] Yeah.

Jake Warner: [00:26:54] Yeah, you can use the large file upload, and the neat thing about that is you only have to know the size of the part, not the entire file. So we built this kind of buffer process where we only buffer 50 megabytes at a time, and then we say, hey, is this a full part? And then we upload it, and we automatically advance along the way. So it's really neat, because how that works for our users now is, whether they have two megabytes of data, maybe a brand new database that they just want to start backing up from day one, or ten terabytes of videos that they want to back up, Cycle will do those backups using only 50 megabytes of disk at a time. It's just really cool, because we're not taking someone's valuable disk space, where they're running their actual applications, and using a big portion of it for those backups. We've obviously gone through a lot of testing with that, and the fact that it's working really well is just really, really cool.

Elton Carneiro: [00:27:59] Totally. And the key part, which I think you should highlight, is that you were able to solve this by looking at our documentation and reading it, because I don't think I responded quickly enough before you solved the problem. So our documentation definitely helps guide that. But there is a difference between our file upload and our large file upload. With the large file upload, all you do is make sure you specify the part size, and as long as all parts except for the last one are the same size, you won't get any issues. The last part is allowed to be smaller than the other parts; that's the way the large file upload works. And a lot of our developer partners, as all developers do, start looking at our APIs, and the same thing exists in the Amazon S3 API, whereby depending on the size of the file you want to put, there are two ways to do it. You can do the simple upload-file route, or the large file upload, which allows you to break the file down into multiple parts, and then you basically concatenate all the parts at the end to form that large object. So it's a simple approach.

Jake Warner: [00:29:12] Well, the other thing that I like about it, and I'm sure other providers support this as well, I haven't looked extensively, but I like that with the Backblaze API you can specifically say, hey, I will append an SSH hash, or sorry, not SSH, a SHA-1 hash, at the end of the segment that I'm sending you. That way, instead of having to know the hash ahead of time, I can build a streamable format, and at the very end of each individual part, I can attach that hash. And that's why this ended up working really well: I wrote a buffer so that as long as I'm reading in data, I am building that hash as I go. It's not like I have to build the part, go back, read it again, and calculate the hash at that point. So, I don't know if we've talked about this extensively, but 100% of Cycle's backend is written in Golang. And within Golang, you kind of have this inherent push to build everything in a streamable format, where you're working with either four kilobyte blocks or 64 kilobyte blocks, I don't remember which, but most of those functions are built with that approach. So the fact that we were able to build that approach into the Backblaze integration as well is really, really neat. And again, it allows our users to back up extremely large amounts of data without using more of their disk space, and then have those backups on a reliable object storage provider like Backblaze, which is huge.

Elton Carneiro: [00:30:43] Yeah, I don't know, Jake, I think you basically laid the foundation of a great blog post. So looking forward to that.

Jake Warner: [00:30:51] Absolutely. But the other thing that I want to talk about, since we're talking about this backup functionality: the day this podcast goes live, that backup functionality will also be going live. So if you're watching this right now, the backup functionality is live. Feel free to give it a test, and if you encounter any issues, please don't hesitate to reach out. But there's one other thing about the backups that is really neat, and that's how we implemented this as opposed to some other strategies we've seen other companies use. It goes back to that streamable approach I was just talking about. There are a lot of backup solutions that say, hey, you just point me at a directory and I'll go back it up for you, right? But that requires extra disk space, because if you're working with a MySQL database or a cluster and you do a mysqldump, do you really want to dump all that data to a directory just to back it up, only to delete that directory afterward, right? And so the approach that Cycle took is we don't look for a file path, we don't look for a directory, we just capture the standard output.

Jake Warner: [00:32:11] Right? So the idea is that you can run any command you want, and as long as you output it back to the console during that backup command, we will capture it in 64-kilobyte blocks as it comes back to the console, and save that as your backup. So that means, suppose you wanted to back up a database. It's as easy as running the mysqldump command and just outputting it to the console, and we'll go and save that as your backup. You're not writing to disk. You're not writing to anything other than a socket that is going to go directly to Backblaze. And the other neat thing there is, if someone says, well, I do want to back up files, how can I do that? You could use tar and just pipe the output from tar back to the standard output, right? So it's really cool, because you can still back up those directories, you're just going to wrap them in a tar command, and you can also tar-gzip them along the way. So you're streaming that output to the console in the same process.

Elton Carneiro : [00:33:04] And you're compressing.

Jake Warner: [00:33:05] Exactly. And you're compressing it. So it's just really cool that we've done it that way. And one of the last things we're working on right now internally, before this goes fully live, is the restore. The way we have it working right now is, if I, from the Cycle interface, click a backup that I've already done and I want to restore it, I should be able to run another command, but the standard input, rather than the output, is what goes over the socket. So we're just going to pipe that backup data right back in through the standard input. It's just really neat, because it allows people to build native backup workflows. One of the things we decided as we were doing this is we don't want to just back up files, because suppose you said, hey, I have this database, I want to back it up, and you just point it at, literally, /var/lib/mysql. The issue there is, if we start backing up that directory, and maybe one of your users, or something that is consuming that database, changes data in the middle of us backing it up, you might have a backup that has literally different data than you expect it to have, because it was created in the middle of a write, right? And so it's really neat, because now we are allowing our developers to use native backup tools; mysqldump knows how to lock tables during a backup and things like that. So it's just really cool to be able to give users that native backup functionality, but abstract it so easily that, as long as they can get it to standard output, and then put it back in via standard input, they can do whatever they want with it.

Elton Carneiro : [00:34:38] That's awesome. So the idea is that you're running a command in your command prompt, mysqldump with whatever credentials, dump this table, you're basically using your greater-than sign, and then you have the Cycle module that's picking that up and storing it in Backblaze B2.

Jake Warner: [00:34:55] Exactly. So when a container is up and running, it's obviously in its own namespaces and so on, so it's isolated from everything else. What we do is, you have your command, and you can say, hey, here's my command, and you can actually define that command with a cron string as well. So even though we're not using crontabs or anything, you can use a traditional cron string. You can say, hey, I want this to run hourly, or I want this to run on the third minute of every hour, or however you want to define it with your cron string. You can use that cron string with that command, and Cycle will jump inside of your container, run that command, and pipe the output back out of the container into our compute process, which is running on all of our customers' servers. And then that compute process is what actually takes care of moving it to Backblaze. So it makes it super easy, because you don't have to install anything extra in a container to use this functionality; as long as you have mysqldump or mongodump or whatever built into that container, you don't need anything else.

Elton Carneiro : [00:35:56] That just sounds super interesting, and I would say a great approach to solving this problem without utilizing disk space, which, you know, is not surprising coming from you, Jake, because I am still super impressed with your SSH proxy. That is by far the most secure way to secure an environment: by not enabling SSH. So I'm definitely not surprised that you were able to solve it this way. Great work. And I will say that this is grounds for a blog post at some point, because there's a lot of cool technology and cool stuff you've done in making this happen.

Jake Warner: [00:36:37] Thank you. Thank you. I'll absolutely have to do a blog entry, because I've actually heard that from a number of our customers and the developers that are using Cycle. It's, hey Jake, we like a lot of the content you're putting out, it's really interesting, but we'd love a deep dive into some of these things, literally just nerding out for a two-and-a-half-hour webinar on how certain things are built and why we've made the decisions we have. Because, obviously, you know enough about Cycle to know that our approach is to only focus on the 80%, right? There are a lot of other companies out there that try to solve every single use case, but then you have all these unnecessary complexities and all these other things. Our goal is: how can we empower our developers to do what they need to do, but keep it very simple? And so that's where, when we started building this backup functionality, it was, hey, you can choose whether you want a directory, or you can choose this or that; we gave like ten different options, and then it was, no, no, no. Let's give a single command. And if you want to back up a directory, use tar, use some of these other things. And that's easy because, just like you said, our goal is keeping things simple, opening the door so you can use it. But I love that you brought up the SSH proxy. That is absolutely my favorite feature in Cycle.

Elton Carneiro : [00:37:53] I love it. And again, feel free to correct me if I'm wrong in the way I explain Cycle to the people I talk to, right? I go back about 15 years, where if you wanted to build a website, you'd go out there and hire a web developer. They would come back, they'd build your website based on your parameters, and boom, you have a website. Fast forward to today: there are so many different platforms out there that allow you to build a beautiful website. And that's what I look at Cycle as. Way back when, if you were going to launch your Docker or Kubernetes platform, you needed a DevOps engineer. Fast forward to now and the future: you don't need that, you need Cycle, and it does it all for you.

Jake Warner: [00:38:37] Oh, absolutely. And it's so neat to see. We have companies that have moved to Cycle from Kubernetes. We have everything from smaller companies that are just using it to host WordPress websites, to larger companies that are building AWS Lambda competitors on top of our platform and doing lots of things with AI. And then we have a couple of fintech companies that are doing some really neat things, and we have some networking companies. It's so cool, because we have empowered these companies: out of all the companies that are using Cycle today, I think only one of them actually has a DevOps engineer working for them. Every other customer we've had has realized, and most of them actually came to us saying, hey, we were getting ready to build a DevOps team, but it looks like, with Cycle, we don't need to. And it's really nice that many of these companies have been on Cycle for multiple years now, and that story has continued to hold: they did not need a DevOps engineer to help their development team use Cycle. To be able to actually sit back and say, hey, we set out with this goal, and we have the results today to look back and show that we've accomplished it, is phenomenal. And now we just need to build more awareness, and being able to work with companies like Backblaze to help tell that story is huge. I'm so excited for it.

Elton Carneiro : [00:40:00] Yeah, totally, totally. It seems like your approach to solving the problem is extremely interesting, and I personally enjoy learning about how the features are built. Every time, I go, wow, I would have never thought of that, but I'm glad you thought of it and solved it that way.

Jake Warner: [00:40:19] And I think the other thing is, you have to be okay with scrapping something. We started this backup thing three times in the last four weeks, and it was only early last week that I said, no, no, no. There are so many times within Cycle that we've done that, not just with backups but with other things, where you set off with the first goal of, hey, we're just going to build this functionality in, and then you start thinking about it and you realize we can make this easier on our users. And I think that's the big thing about our team: we put a lot of focus on this. Our story at the end of the day is all about simplicity. So if we can make a feature or functionality simpler for our users, even if that requires rebuilding something we just started, we will absolutely do that.

Elton Carneiro : [00:41:12] You know, that resonates with so many developers out there. I don't develop much code these days, but the number of times, back in the day, that I wrote a piece of functional code, came back to it a day later, and went, what the hell was I thinking? And I'd just scrap it and start again from scratch, because you look at it and you go, this is not going to work long term, there are so many flaws in the design or the way you thought about solving the problem. And going back to what you said: if you can recognize that there's a simpler, easier way to do it, and are willing to scrap what you did to make it simple and easy, that is the right approach. And it ties into what we do at Backblaze. We try to make things super simple for our customers, from our user interface to our APIs, and all around as a company, our goal is to be that simple, trusted, affordable cloud storage platform for our customers.

Jake Warner: [00:42:11] Absolutely. I mean, working with your team has been phenomenal, not just for the performance of the B2 service, obviously, but within the team itself. Anytime we've needed help, the Backblaze team has been there to immediately jump in and say, hey, how can we help? What's the issue? And that's what you want in a partner at the end of the day: not just a company that says, hey, we'd like you to help us grow our revenue, but someone who's there to help you get out of the trenches when something's not working the way it should, right? Which I guess brings me to the final point here: what's next as our companies work together? Obviously you and I have had enough conversations to know there's a really big feature outside of backups that we'll be working on together, likely a feature that we launch early next year. I don't want to give too much of it away, but I'm trying to think: do I drop a hint, or do I just save it? Because it's really cool, and I think it's game-changing for how containers can consume storage.

Elton Carneiro : [00:43:34] So what I would say is, let's get our audience to weigh in after this, because we can always respond and give them hints. If you want to give a hint, I'm okay with that, too.

Jake Warner: [00:43:50] I think the hint that I just gave is that it will allow containers to consume storage in a very unique way. We'll leave it at that and see if anyone can start to guess what it is.

Elton Carneiro : [00:44:02] There you go. And maybe we'll draw a prize for the winner who guesses right.

Jake Warner: [00:44:06] Excellent. Yeah, that's a great idea. We should absolutely do that. And for anyone who's been watching or listening, whether you're on Spotify, Apple Podcasts, or I think Google Podcasts too, if you have any questions, or you'd like to learn more about Cycle or Backblaze, feel free to reach out on Twitter or LinkedIn, and we'll get back to you as soon as we can. It's been great chatting with you. I've really enjoyed this conversation.

Elton Carneiro : [00:44:38] Oh, thanks, Jake. It's been a pleasure as well, and I'm really excited for what you guys at Cycle are building in the future.

Jake Warner: [00:44:45] Excellent. And I guess before we wrap up, would you like to mention Backblaze's Developer Day?

Elton Carneiro : [00:44:52] Yes. So Backblaze is hosting a Developer Day on October 21st, 2021. It's geared to be more technically focused: a lot of content, launching new partnerships, as well as helping our customers understand how developers consume storage and the various different use cases, like the ones we articulated in this podcast. That's the focus in general. We're definitely looking at highlighting some of the successes we've seen in allowing our developers to build in the cloud, as well as continuing to get feedback and learn from our developer audience about what we can do to improve and be better. So, October 21st. All the information is available on our website today; we put out a blog post about it. Feel free to register, and I look forward to seeing you guys there.

Jake Warner: [00:45:51] Perfect. And we will definitely be there, our Cycle team. Excellent, looking forward to it. And again, thank you so much for your time. It's been great, and I'm looking forward to the next time we can do a podcast, when we talk about that next feature, likely early 2022. So, great. Thank you so much.

Elton Carneiro : [00:46:09] Thanks, Jake. Yeah, cheers.