Open Source Neural Networks in 500 Lines or Less with Marina Samuel
The fourth book in the Architecture of Open Source Applications series is called “500 Lines or Less.” The book focuses on the design decisions that developers make in the small when they are building something new. Marina Samuel, Staff Software Engineer at Mozilla, is one of the authors featured in the book, for which she wrote a simple 500-line neural network for OCR. I spoke with Marina about her early career at Mozilla, her work on the Firefox browser, notably on privacy initiatives, and about how to get involved as a first-time contributor to open source.
Transcript
Ledge: Marina, it’s really cool to have you. Thanks for joining us.
Marina: Thanks for inviting me, David. It’s so exciting to be here.
Ledge: That’s awesome. It’s great to have you. Would you give us a short two- to three-minute story about yourself, your experience, and your current work with Mozilla?
Marina: I’m somewhat of a recent grad ─ like a little bit over five years ago. I went to the University of Waterloo for undergrad software engineering which was a really cool program because I got to do lots of co-ops there.
I tried a few different companies and spent some time in California and then came back home here in Toronto to work at Mozilla. I’ve been here for about five years full time. I left for a bit to do a masters, which Mozilla very much supported so that I could continue my education. And then, I came back and continued and used some of the stuff that I learned throughout my masters.
I’ve been on a variety of different projects at Mozilla, ranging from working directly in Firefox, writing JavaScript and C++ code, to working on data analyses and figuring out “Do people like a feature in Firefox? Why or why not?” up until where I am now, doing more Python backend work, still on the data side of things, helping people do their own analyses through querying data sources and so on.
So that’s where I am right now.
Ledge: And Mozilla being one of the most important names in open source and Internet technology, everybody gets the “wow, you work at probably one of the coolest, most important places.”
Before we went on the air, you described the distributed work and the engineering teams all over the world working on different things. What is it like across all those project teams?
Talk about that experience and all the different projects and how they get managed and come together.
Marina: Mozilla is very distributed all over the world and there are a lot of offices all over the world but there are also a lot of remotees. In fact, my manager is in California and I’m here in Toronto. Nobody on my team is with me in Toronto.
We do regular meetups throughout the year. The whole company meets twice a year in some location that they choose in advance, and it helps people to connect. You say, “Okay, that’s the IRC handle that I’m used to seeing. Oh, that’s the face,” and you match it up, and it makes work connections a lot better.
There are also smaller meetups. Individual teams might meet up in a location where most of the team members might be. For example, recently, my team met up in Berlin and had a really productive work week there.
Ledge: What’s your favorite Mozilla project? I imagine there’s a lot of internal discussion. You guys have hundreds of things going on. I’m just curious. What’s the cutting-edge project everybody wants to be on, or some of the big things that you hear about?
Marina: There’s so much going on. In terms of cutting edge, and I say this especially because I specialized in machine learning and data analysis in my masters, Mozilla right now has an AI team; we also have a VR team. They’re kind of “researchy” teams so we don’t have big products out there that people are hearing about yet.
The AI team is really interesting and doing research in text to speech and speech to text and using deep neural networks. Everybody is super excited about neural networks these days.
I would say that that’s the cutting edge stuff that lots of people are really excited to learn more about and contribute to that team and so on.
Ledge: We reached out to you because of your authorship on a particular work. I would love for you to talk about that ─ the group of people who put that together and the experience of “What is that book? How does everybody get it? Why is it important?”
Just talk about that a little bit because I just thought that was a really great project.
Marina: Sure. I believe the book is called “500 Lines or Less” and it’s part of a series of software architecture-type books. I’m not super familiar with the guy who started it but I think he was a U of T professor and he also worked at Mozilla for some time. He was reaching out to find people who would like to contribute to the book to help others learn about a variety of topics.
I was intrigued by this topic ─ 500 Lines or Less. The topic I chose was a simple neural network that can do OCR and I liked it because a lot of people think in their head like, machine learning sounds complicated; AI sounds complicated ─ where do I get started?
So for that type of person, that was what the chapter that I wrote was tailored to. It’s like, “Here, look! It’s actually not that hard to get started. You can do it in under 500 lines. If you’ve ever written some code before, you can totally get started on this.”
That was my personal motivation to write that and it’s almost like decrypting how people might perceive AI or machine learning.
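To give a sense of how little code a network like the one Marina describes can take, here is a minimal sketch in plain Python. It is not the code from her chapter. The 3x3 "glyphs", the `TinyNetwork` class, the layer sizes, and the learning rate are all assumptions chosen for illustration: one hidden layer with sigmoid activations, trained by backpropagation to tell three tiny pixel patterns apart.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Hypothetical one-hidden-layer network trained with backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per neuron; the last entry in each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, pixels):
        x = list(pixels) + [1.0]  # constant input that feeds the bias weight
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)))
                  for row in self.w_hidden]
        h = hidden + [1.0]
        output = [sigmoid(sum(w * v for w, v in zip(row, h)))
                  for row in self.w_out]
        return hidden, output

    def train_step(self, pixels, target, lr=0.5):
        hidden, output = self.forward(pixels)
        x = list(pixels) + [1.0]
        h = hidden + [1.0]
        # Output deltas: prediction error times the sigmoid derivative.
        d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
        # Hidden deltas: push the output deltas back through the output
        # weights (computed before those weights are updated).
        d_hidden = [hidden[j] * (1 - hidden[j]) *
                    sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                    for j in range(len(hidden))]
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * h[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= lr * d_hidden[j] * x[i]

    def predict(self, pixels):
        _, output = self.forward(pixels)
        return output.index(max(output))


# Three made-up 3x3 "characters", flattened row by row.
GLYPHS = {
    "|": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "-": [0, 0, 0, 1, 1, 1, 0, 0, 0],
    "\\": [1, 0, 0, 0, 1, 0, 0, 0, 1],
}
LABELS = list(GLYPHS)


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def mean_squared_error(net):
    total = 0.0
    for idx, label in enumerate(LABELS):
        _, output = net.forward(GLYPHS[label])
        target = one_hot(idx, len(LABELS))
        total += sum((o - t) ** 2 for o, t in zip(output, target))
    return total / (len(LABELS) * len(LABELS))


net = TinyNetwork(n_in=9, n_hidden=5, n_out=3)
loss_before = mean_squared_error(net)
for _ in range(2000):
    for idx, label in enumerate(LABELS):
        net.train_step(GLYPHS[label], one_hot(idx, len(LABELS)))
loss_after = mean_squared_error(net)

for label in LABELS:
    print(repr(label), "->", repr(LABELS[net.predict(GLYPHS[label])]))
```

A real OCR setup would use larger images, many noisy samples per character, and a held-out test set; the sketch only shows that the forward pass and the weight updates fit comfortably in well under 100 lines.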
Ledge: You have had an awesome experience in the first short years of your career. What would you tell students and junior developers now who are looking to get into open source and look into working on cool projects?
I think everybody hopes to make a leap and get into the great stuff that you’ve been doing. What’s your advice for that set of people?
Marina: Something really important is that there’s no place that’s too small to start. So from the perspective of someone who works at an open source company, we always want contributors. Even if all they’re doing is changing a line of documentation in the code, that’s super exciting.
Yes, please come join us.
If you’re just browsing a code base and you don’t know where to get started contributing to open source, even the smallest thing is super appreciated.
And then, in terms of more general career advice, I would say that it’s really good to look for opportunities and take opportunities wherever you can get them. It’s really good to speak to people around you about their experiences as well.
For example, I love where I stand in my career now and I love what I’m doing right now but I’d also like to learn what other people around me are doing and learn what they like or don’t like because maybe they’ll tell me something that is enlightening to me.
I think high-level networking is very important and learning from others through that networking.
I like to personally be a “yes” person. If you propose some idea to me, my first reaction will be “Yes, I want to try it. Yes, I’ll see what I can do to do that.”
I think having that attitude opens up a lot of opportunities. That’s the advice I would give to new grads who are getting into this.
Ledge: Fantastic! I think you’re right about open source. It can be a little intimidating to sort of get started. There’s this broad universe of repos and projects and you’re kind of like, how can I make a difference?
It’s good advice to be able to say, “Hey, just try something.” It’s like making your first edit on Wikipedia. You kind of don’t feel important but that contribution does matter; and then, you can do a little bit more later and then put in a PR. It makes a lot of sense.
Marina: I have one more comment on that. At Mozilla, we try to do this. We try to tag issues as good first bugs. So if a contributor rolls in and they’re like, “Oh, where do I start?” there’s a tag to look for so it’s easier for them.
I’m sure that other open source places that want to bring contributors will do something similar as well. So keep an eye out for those. That’s another piece of advice.
Ledge: What’s up with the new Firefox? Should we all be excited?
Marina: Yes, totally! It’s faster and I think the marketing is “as fast as Chrome or faster.” It’s a really competitive world out there.
My honest perspective is that the truth is, it’s faster in some aspects and not as fast in other aspects. I don’t have the exact metrics with me right now. But I would recommend to people to go and try it and see what you think of it because it depends on how you use the browser and what you use it for. So it’s really up to the person.
And I think it’s not just about the speed, although that’s a big deal to people. That is a big reason people are saying, “I choose Chrome over Firefox” or whatever they’re choosing.
But another thing that I think is so important at Mozilla and in Firefox is privacy. I think it’s also kind of cultural.
From what I’m hearing, privacy is a bigger cultural thing in Europe than it is in North America. Maybe in recent times with a lot of the stuff that’s been going on, people might start to appreciate privacy more.
The one thing I want to say as an insider at Mozilla is that we’re not joking and exaggerating when we say we care about privacy. It is so important that projects will stall if there is any sense of privacy breaches to a user.
For example, I was on a team that was looking to do content recommendations to users and the best way to do that is server side. That’s how it’s been done. That’s how everybody does it.
But there are privacy risks with doing stuff on the server. You have to make sure it’s all encrypted well. If the government wants to see that data, you have to give it up and all that stuff.
We found a way to do some of the content recommendation on the client side, and this was something that Mozilla spent a lot of time on because we wanted to make sure that we’re not making our users vulnerable in any way.
I just want to say, from an insider perspective, that I think that’s a big deal; and for people who care about privacy, they should totally go for Firefox. Of course, I’m biased.
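The client-side idea Marina describes can be pictured with a small sketch. This is not Mozilla's implementation, and every name in it (`build_local_profile`, `rank_candidates`, the sample articles and topics) is hypothetical. The assumption is that the device downloads the same generic list of candidates everyone gets, then ranks it against an interest profile that stays on the device, so no browsing history has to be sent to a server.

```python
from collections import Counter


def build_local_profile(visited_topics):
    """Count topics seen locally. In this sketch the profile never leaves the device."""
    return Counter(visited_topics)


def rank_candidates(profile, candidates, top_n=2):
    """Score each candidate by how often its topics appear in the local profile."""
    def score(item):
        # Counter returns 0 for topics the user has never seen.
        return sum(profile[topic] for topic in item["topics"])

    ranked = sorted(candidates, key=score, reverse=True)
    return [item["title"] for item in ranked[:top_n]]


# The same untargeted candidate list would be served to every client.
candidates = [
    {"title": "Intro to Rust", "topics": ["programming", "rust"]},
    {"title": "Sourdough basics", "topics": ["cooking"]},
    {"title": "Browser privacy tips", "topics": ["privacy", "web"]},
]

# Hypothetical topics derived from pages visited on this device.
profile = build_local_profile(["privacy", "web", "privacy", "programming"])
recommended = rank_candidates(profile, candidates)
print(recommended)  # ['Browser privacy tips', 'Intro to Rust']
```

The trade-off in a design like this is that the server cannot personalize what it sends, so the client has to download more candidates than it will show and do the ranking work itself.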
Ledge: That’s great. That’s good to hear ─ the passion. I love to hear people get excited about the inside baseball of their product. We feel the same way about what we’re doing.
It’s like “You want to hire the best engineers. That’s what we do and we’re very excited about it.”
Last question: I ask everybody this.
How do you evaluate if someone is a senior excellent A+ elite engineer? What are your heuristics for what makes a great software engineer?
Marina: That’s such an interesting question. I feel like you can tell by their passion but also by looking at the work they do. How meticulous are they? Do they think about edge cases?
I wish that the standard interview process for software engineering was a little different. I’ve seen some more recent setups that I like more where you give someone a real world problem and you see how they solve it. You see what their code looks like.
Is it easy to read? As I’ve mentioned, are they thinking of edge cases? How do they review other people’s code?
Being respectful and being open to other ideas ─ actually, I can’t stress that enough. If an engineer is being close-minded to another engineer’s idea, you could see it as a cultural thing but I feel that you’re not going to make progress that way. You have to take every idea in and break it down and, together, agree on the pros and cons of that idea and agree on a goal.
And so, just having that ability to be open-minded is also really important. At the end of the day, I feel like I can usually tell instinctively if an engineer is very competent. Just by interacting with many in the past, you see someone who has really great output and what their characteristics are.
Now, I can look for those characteristics in other people.
Ledge: You make a great point: you learn how excellent someone is over a period of project-based or collaborative work, which shows you how difficult it is to do hiring and evaluation on a short timeline, whether that’s three interviews or a project or whatever it is. Most of this stuff comes out over months and months of work.
The way we suggest that people sort of address that challenge is through freelance and contract work because you can step up and continue to do more when you’re comfortable. And I think that it’s a really good buying proposition to work with contractors and freelancers and get to know somebody and not be locked in and have that terror of “I hired the wrong person and now I need to fire them.”
Awesome! So cool to have you on. I really appreciate your contributions and, of course, those of Mozilla.
It’s nice to have you on the show today.
Marina: Likewise! It was great chatting with you. Thank you so much for having me.