'web' Episodes

Boring Old Application Programming Interfaces

     11/4/2019

Welcome to the History of Computing Podcast, where we explore the history of information technology. Because by understanding the past, we're able to be prepared for the innovations of the future! Today's episode is gonna be a bit boring. It's on APIs. An API, or Application Programming Interface, is a set of tools, protocols, or routines used for building applications. See, boring! Most applications and code today are just a collection of REST endpoints interconnected with fancy development languages. We can pull in a lot of information from other apps and get a lot of code, as we say these days, "for free." It's hard to imagine a world without APIs. It's hard to imagine what software would be like if we still had to write memory to a specific register in order to accomplish basic software tasks. Abstracting these low-level tasks is done by providing classes of software that give developers access to common tasks they need to perform. These days, we just take this for granted. But once upon a time, you did have to write all of that code over and over, on PCs, initially in BASIC, Pascal, or assembly for really high performance tasks. Then along comes Roy Fielding. He writes his dissertation, Architectural Styles and the Design of Network-based Software Architectures, in 2000. But APIs came out of a need for interaction between apps and devices. Between apps and web services. Between objects and other objects. The concept of the API started long before Y2K though. In the 60s, we had libraries in operating systems. But what Subrata Dasgupta referred to as the second age of computer science in the seminal book by the same name began in 1970. And the explosion of computer science as a field in the 70s gave us the rise of Message Oriented Middleware, and then Enterprise Application Integration (EAI) became the bridge into mainframe systems. This started a weird time. IBM ruled the world, but they were listening to the needs of customers and released MQSeries to facilitate message queues. I realize message queues are boring. Sorry. I've always felt like the second age of computer science is split right down the middle. The 1980s brought us into the era of object-oriented programming, when Alan Kay and his coworkers from Xerox PARC gave us Smalltalk, the first popular object-oriented programming language, and began to codify methods and classes. Life was pretty good. This led to a slow adoption across the world of the principles of Alan Kay, vis-à-vis Doug Engelbart, vis-à-vis Vannevar Bush. The message passing and queuing systems were most helpful in very large software projects where there were a lot of procedures or classes you might want to share to reduce the cyclomatic complexity of those projects. Suddenly distributed computing began to be a thing. And while it started in research institutes like PARC and academia, it proliferated into the enterprise throughout the 80s. Enterprise computing is boring. Sorry again. The 90s brought grunge. And I guess this little uninteresting thing called the web. And with the web came JavaScript. It was pretty easy to build an API endpoint, or a programmatic point that you programmed to talk to a site, using a JSP, or JavaServer Page, which helped software developers create dynamically generated pages, such as those that take a query for information, pass that query on to a database, and return the response. You could also use PHP, Ruby, ASP, and even NeXT's WebObjects, the very name of which indicates an object-oriented programming language.
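To make the idea of an endpoint a little more concrete, here's a minimal sketch of one in Python using only the standard library; the /api/episodes path and the data it returns are made up for illustration, not anything from a real service.

from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer one hypothetical endpoint with a JSON payload
        if self.path == "/api/episodes":
            body = json.dumps([{"id": 1, "title": "Boring Old APIs"}]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # Any HTTP client, or a browser, can now hit http://localhost:8080/api/episodes
    HTTPServer(("localhost", 8080), ApiHandler).serve_forever()

A JSP, a PHP script, or a WebObjects app in the 90s did the same basic job: accept a request at a path, maybe talk to a database, and hand back a structured response.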
The maturity of API development environments led to Service-Oriented Architectures in the early 2000s, where we got into more function-based granularity. Instead of simply writing an endpoint to make data that was in our pages accessible, we would build pages on top of those endpoints and then build contracts for those endpoints that guaranteed we would not break the functionality other teams needed. Now other teams could treat our code as classes they'd written themselves. APIs had shot into the mainstream. Roy Fielding's dissertation legitimized APIs, and over the next few years entire methodologies for managing teams based on the model began to emerge. Fielding wasn't just an academic. He would help create the standards for HTTP communication. And suddenly having an API became a feature that helped propel the business. This is where APIs get a bit more interesting. You could transact online. eBay shipped an API in 2000, giving developers the ability to build their own portals. They also released low-code options called widgets that you could just drop into a page and call to produce a tile, or iframe. The first Amazon APIs shipped in 2002, in an early SOAP iteration, along with widgets as well. In fact, embedding widgets became much bigger than APIs, and iframes are still common practice today, although I've never found a *REAL* developer who liked them. I guess I should add that to my interview questions. The Twitter API, released in 2006, gave other vendors the ability to write their own Twitter app, but also gave us the concept of OAuth, a federated identity standard. Amazon released their initial web services APIs that year, making it possible to use their storage and compute clusters and automate the tasks to set them up and tear them down. Additional APIs would come later, giving budding developers the ability to write software and host data in databases, even without building their own big data compute clusters. This too helped open the doors to an explosion of apps and web apps. These days they basically offer everything, including machine learning, as a service, all accessible through an API. The iPhone 3G wasn't boring. It came along in 2008, and suddenly the world of mobile app development was unlocked. Foursquare came along at about the same time and opened up their APIs. This really popularized the concept of using other vendors' APIs to accomplish various tasks without having to write all the code to do those tasks yourself. From there, more and more vendors began to open APIs, and not only could you pull in information, you could also push more information out. And the ability to see settings gives us the ability to change them as well. From the consumer Foursquare to the enterprise, now we have microservices available to do anything you might want to do. Microservices are applications that get deployed as modular services. There are private APIs, or those that are undocumented; public APIs, or interfaces anyone can access; and partner APIs, or those requiring a key to access. At this point, any data you might want to get into an app is probably available through an API. Companies connect to their own APIs to get data, especially for apps. And if a vendor refuses to release their own API, chances are some enterprising young developer will find a way if there's an actual desire to leverage their data, which is what happened to Instagram. Until they opened up their API, at least.
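Calling one of those partner APIs usually looks something like this sketch; the https://api.example.com endpoint and the token are hypothetical placeholders standing in for whatever a real vendor's OAuth flow or API key program would issue.

import json
import urllib.request

API_URL = "https://api.example.com/v1/orders"   # hypothetical partner endpoint
TOKEN = "replace-with-a-real-access-token"      # whatever the vendor's OAuth flow hands you

request = urllib.request.Request(
    API_URL,
    headers={
        "Authorization": f"Bearer {TOKEN}",  # the key or token that gates partner access
        "Accept": "application/json",
    },
)

with urllib.request.urlopen(request) as response:
    orders = json.load(response)
    print(f"Fetched {len(orders)} records")

The pattern is the same whether the vendor is eBay, Amazon, Twitter, or anyone else: a documented URL, a credential, and structured data coming back.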
And Facebook, who released their API to any developer well over a decade ago, is probably the most villainized in this regard. You see, Facebook allowed a pretty crazy amount of data to be accessible in their API until, all of a sudden, Cambridge Analytica supposedly stole elections with that data. There's nothing boring about stealing elections! Whether you think that's true or not, the fact that Facebook is the largest and most popular social network in the history of the world shines a spotlight on them when technology being used by everyone in the industry is taken advantage of. I'm not sticking up for them or villainizing them; but when I helped to write one of the early Facebook games and was shown what we now refer to as personally identifiable data, and was able to crawl a user to get to their friends to invite them to add our game, and then their friends, it didn't seem in the least bit strange. We'd done spidery things with other games. Nothing weird here. The world is a better place now that we have OAuth grant types and every other limiter on the planet. Stripe, in fact, gave any developer the ability to quickly and easily process financial transactions. And while there were well-entrenched competitors, they took over the market by making the best APIs available. They understood that if you make it easy and enjoyable for developers, they will push for adoption. And cottage industries of apps have sprung up over the years, where apps aggregate data from other sources into a single pane of glass. Tools like Wikipedia embrace this, banks allow Mint and QuickBooks to aggregate and even control finances, while advertising-driven businesses like portals and social networks seem to despise it, understandably. Sometimes they allow it to gain market share and then start to charge a licensing fee when they reach a point where the cost is too big not to, like what happened with Apple using Google Maps until suddenly they started their own mapping service. Apple, by the way, has never been great about exposing or even documenting their publicly accessible APIs outside of those used in their operating systems, APNs, and their profile management environment. The network services Apple provides have long been closed off. Today, if you write software, you typically want that software to be what's known as API-first. API-first software begins with the tasks users want your software to perform. The architecture and design mean the front end, or any apps, just talk to those backend services and perform as little logic not available through an API as possible. This allows you to issue keys to other vendors and build integrations so those vendors can do everything you would do, and maybe more. Suddenly, anything is possible. Combined with continuous deployment, continuous testing, continuous design, and continuous research, we heavily reduce the need to build so much, substantially slashing the time and cost it takes to get to market. When I think of what it means to be nimble, no matter how big the team, that's what I think of. Getting new products and innovations to market shouldn't be boring. APIs have helped to fulfill some of the best promises of the Information Age, putting an unparalleled amount of information at our fingertips. The original visionary of all of this, Vannevar Bush, would be proud. But I realize that this isn't the most exciting of topics. So thank you for tuning in to yet another episode of the History of Computing Podcast. We're so lucky to have you. Have a great day!


Before The Web, There Was Gopher

     10/23/2019

Welcome to the History of Computing Podcast, where we explore the history of information technology. Because understanding the past prepares us for the innovations of the future! Today we're going to talk about Gopher. Gopher was in some ways a precursor to the World Wide Web, or more specifically, to HTTP. The University of Minnesota was founded in 1851. It gets cold in Minnesota. Like really cold. And sometimes, it's dangerous to walk around outside. As the University grew, they needed ways to get students between buildings on campus. So they built tunnels. But that's not where the name came from. The name actually comes from a political cartoon. In the cartoon, a bunch of not-cool railroad tycoons were pulling a train car to the legislature. The rest of the country just knew it was cold in Minnesota and there must be gophers there. That evolved into the Gopher State moniker, the Gopher mascot of the U, and later the Golden Gophers. The Golden Gophers were once a powerhouse in college football. They have won the eighth-most national titles of any university in college football, although they haven't nailed one since 1960. Mark McCahill turned 4 years old that year. But by the late 80s he was in his thirties. McCahill had graduated from the U in 1979 with a degree in chemistry. By then he managed the Microcomputer Center at the University of Minnesota–Twin Cities. The University of Minnesota had been involved with computers for a long time. The Minnesota Educational Computing Consortium had made software for schools, like The Oregon Trail. And even before then they'd worked with Honeywell, IBM, and a number of research firms. At this point, the University of Minnesota had been connected to the ARPANET, which was evolving into the Internet, and everyone wanted it to be useful. But it just wasn't yet. Maybe TCP/IP wasn't the right way to connect to things. I mean, maybe BITNET was. But by then we knew it was all about TCP/IP. They'd used FTP. And they saw a lot of promise in the tidal wave you could just feel coming of this Internet thing. There was just one little problem. A turf war had been raging for a time, with the suit-and-tie batch-processed mainframe crowd thinking that big computers were the only place real science could happen, and the personal computer kids thinking that the computer should be democratized and that everyone should have one. So McCahill writes a tool called POPmail to make it easy for people to access this weird thing called email on the Macs that were starting to show up at the University. This led to his involvement writing tools for departments. 1991 rolls around and some of the department heads around the University meet for months to make a list of things they want out of a network of computers around the school. Enter Farhad Anklesaria. He'd been working with those department heads and reduced their demands to something he could actually ship: a server that hosted some files and a client that accessed the files. McCahill added a search option and combined the two. They brought in four other programmers to help finish the coding. They finished the first version in about three weeks. Of those original programmers, Bob Alberti, who'd already helped write an early online multiplayer game, named his Gopher server Indigo after the Indigo Girls. Paul Lindner named one of his Mudhoney. They coded between taking support calls in the computing center.
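The protocol they shipped was refreshingly simple: a client opens a TCP connection to port 70, sends a selector string followed by a carriage return and line feed, and the server streams back a menu or document and closes the connection. Here's a minimal sketch of that exchange in Python; gopher.floodgap.com is a long-running public Gopher server, but any Gopher host would do.

import socket

def gopher_fetch(host, selector="", port=70):
    # The entire request is just the selector plus CRLF (later written down as RFC 1436)
    with socket.create_connection((host, port)) as conn:
        conn.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it's done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

if __name__ == "__main__":
    # An empty selector asks for the server's root menu
    print(gopher_fetch("gopher.floodgap.com"))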
They'd invented bookmarks and hyperlinks, which led McCahill to coin the term "surf the internet." Computers at the time didn't come with the software necessary to access the Internet, but Apple was kind enough to include a networking library. People could get on the Internet and pretty quickly find some documents. Modems weren't fast enough to add graphics yet. But using Gopher, you could search the internet and retrieve information linked from all around the world. Wacky idea, right? The world wanted it. They gave it the name of the school's mascot to keep the department heads happy. It didn't work. It wasn't a centralized service hosted on a mainframe. How dare they. They were told not to work on it any more but kept going anyway. They posted an FTP repository of the software. People downloaded it and even added improvements. And it caught fire underneath the noses of the University. This was one of the first rushes on the Internet. These days you'd probably be labeled a decacorn for the type of viral adoption they got. The White House jumped on the bandwagon. MTV veejay Adam Curry wore a gopher shirt when they announced their Gopher site. There were GopherCons. Al Gore showed up. He wasn't talking about the Internet as though it were a bunch of tubes yet. Then Tim Berners-Lee put the first website up in 1991, introducing HTML, and what we now know as the web was slowly growing. McCahill then worked with Berners-Lee, Marc Andreessen of Netscape, Alan Emtage, and former MIT whiz kid Peter J. Deutsch. Oh, and the czar of the Internet, Jon Postel. McCahill needed a good way of finding things on his new Internet protocol. So he helped invent something that we still use constantly: URLs, or Uniform Resource Locators. You know when you type http://www.google.com, that's a URL. The http indicates the protocol to use. Every computer has a default handler for those protocols. Everything following the :// is the address on the Internet of the object. Gopher of course was gopher://. FTP was ftp:// and so on. There's of course more to the spec, but that's the first part. Suddenly there were competing standards. And as with many rapid rushes to adopt a technology, Gopher started to fall off and the web started to pick up. Gopher went through the hoops. It went to an IETF RFC in 1993 as RFC 1436, The Internet Gopher Protocol (a distributed document search and retrieval protocol). I first heard of Mark McCahill when I was on staff at the University of Georgia and had to read up on how to implement this weird Gopher thing. I was tasked with deploying Gopher to all of the Macs in our labs. And I was fascinated, as were so many others, with this weird new thing called the Internet. The internet was decentralized. The Internet was anti-authoritarian. The Internet was the Sub Pop Records of the computing world. But bands come and go. And the University of Minnesota wanted to start charging a licensing fee. That started the rapid fall of Gopher and the rise of the HTML-driven web from Berners-Lee. It backfired. People were mad. The team hadn't grown or gotten headcount or funding. The team got defensive publicly, and while Gopher traffic continued to grow, the traffic on the web grew 300 times faster. The web came with no licensing. Yet. Modems got faster. The web added graphics. In 1995 an accounting disaster came to the U and the team got reassigned to work on building a modern accounting system. At a critical time, they didn't add graphics. They didn't further innovate.
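Circling back to the anatomy of a URL for a second: those pieces are formal enough that every modern language ships a parser for them. A quick illustration in Python; the addresses are just placeholders.

from urllib.parse import urlparse

examples = [
    "http://www.google.com/",
    "gopher://gopher.floodgap.com/1/",
    "ftp://ftp.example.com/pub/readme.txt",
]

for url in examples:
    parts = urlparse(url)
    # scheme picks the protocol handler; netloc is the host; path is the object on that host
    print(f"{parts.scheme:8} host={parts.netloc:28} path={parts.path}")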
The air was taken out of their sails by the licensing drama and the lack of funding. Things were easier back then. You could spin up a server on your computer and other people could communicate with it without fear of your identity being stolen. There was no credit card data on the computer. There was no commerce. But by the time I left the University of Georgia, we were removing the Gopher apps in favor of NCSA Mosaic and then Netscape. McCahill has since moved on to Duke University. Perhaps his next innovation will be called Document Devil or World Wide Devil. Come to think of it, that might not be the best idea. Wouldn't wanna upset the Apple Cart. Again. The web as we know it today wasn't just some construct that happened in a vacuum. Gopher was the most popular protocol to come before it, but there were certainly others. In those three years, people saw the power of the Internet and wanted to get in on that. They were willing it into existence. Gopher was first, but the web built on top of the wave that Gopher started. Many browsers still support Gopher, either directly or using an extension to render documents. But Gopher itself is no longer much of a thing. What we're really getting at is that the web as we know it today was deterministic. Which is to say that it was almost willed into being. It wasn't a random occurrence. The very idea of a decentralized structure was being willed into existence by people who wanted to supplement human capacity, or by a variety of other motives, including "cause it seemed cool at the time, man." It was almost independent of the actions of any specific humans. It was just going to happen, as though the free will of any individual actors had been removed from the equation. Bucking authority, like the department heads at the U, hackers from around the world just willed this internet thing into existence. And all these years later, many of us are left in awe at their accomplishments. So thank you to Mark and the team for giving us Gopher, and for the part it played in the rise of the Internet.


The Apache Web Server

     10/29/2019

Welcome to the History of Computing Podcast, where we explore the history of information technology. Because understanding the past prepares us for the innovations of the future! Today we're going to cover one of the most important and widely distributed server platforms ever: the Apache Web Server. Today, Apache servers account for around 44% of the 1.7 billion web sites on the Internet. But at one point it was zero. And this is crazy: it's down from over 70% in 2010. Tim Berners-Lee had put the first website up in 1991, and what we now know as the web was slowly growing. Our story picks up in 1994 and begins with the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign. Yup, NCSA is also the organization that gave us telnet and Mosaic, the web browser that would evolve into Netscape. Apache is a free and open source web server that was initially created by Robert McCool and written in C in 1995, a few years after Berners-Lee coined the term World Wide Web. You can't make that name up. I'd always pictured him as a cheetah wearing sunglasses. Who knew that he'd build a tool that would host half of the web sites in the world? A tool that would go on to be built into plenty of computers so they can spin up sharing services. McCool had worked on the NCSA HTTPd web server, and after Rob left NCSA, its development went a little, um, dormant. The code had forked, and the extensions and bug fixes needed to get merged into a common distribution. Times have changed since 1995. Originally the name was supposedly a cute reference to "a patchy server," given that it was based on lots of existing patches of craptastic code from NCSA. So it was initially based on NCSA HTTPd, and that lineage is still alive and well, all the way down to the configuration files. For example, on a Mac these are stored at /private/etc/apache2/httpd.conf. The original Apache group consisted of:
* Brian Behlendorf
* Roy T. Fielding
* Rob Hartill
* David Robinson
* Cliff Skolnick
* Randy Terbush
* Robert S. Thau
* Andrew Wilson
And there were additional contributions from Eric Hagberg, Frank Peters, and Nicolas Pioch. Within a year of that first shipping, Apache had become the most popular web server on the internet. The distributions and sites continued to grow to the point that, in 1999, they formed the Apache Software Foundation, which would give financial, legal, and organizational support for Apache and make sure the project outlived the original participants. They even started bringing other open source projects under that umbrella. Projects like Tomcat. And the distributions of Apache grew. Mod_ssl, which brought the first SSL functionality to Apache 1.3, was released in 1998. And it grew. The first conference, ApacheCon, came in 2000. Douglas Adams was there. I was not. There were 17 million web sites at the time. The number of web sites hosted on Apache servers continued to rise. Apache 2 was released in 2002. The number of web sites hosted on Apache servers continued to rise. By 2009, Apache was hosting over 100 million websites. By 2013, Apache had added that it was named "out of a respect for the Native American Indian tribe of Apache." The history isn't the only thing that was rewritten. Apache itself was rewritten and is now distributed as Apache 2.0. There were over 670 million web sites by then. And we hit 1 billion sites in 2014. I can't help but wonder what percentage are collections of fart jokes. Probably not nearly enough. But an estimated 75% are inactive sites.
The job of a web server is to serve web pages on the internet. Those were initially flat HTML files but have gone on to include CGI, PHP, Python, Java, JavaScript, and others. A web browser is then used to interpret those files. It accesses the .html or .htm file (or one of the many other file types that now exist), opens a page, loads the text, images, and included files, and processes any scripts. Both use the HTTP protocol; thus the URL begins with http, or https if the site is being hosted over SSL. Apache is responsible for providing the access to those pages over that protocol. The way the scripts are interpreted is through mods. These include mod_php, mod_python, mod_perl, etc. The modular nature of Apache makes it infinitely extensible. OK, maybe not infinitely. Nothing's really infinite. But the loadable dynamic modules do make the system more extensible. For example, you can easily get TLS/SSL using mod_ssl (there's a short httpd.conf sketch at the end of this section). The great thing about Apache and its mods is that anyone can adapt the server for generic uses, and they also let you get into some pretty specific needs. And the server, as well as each of those mods, has its source code available on the Interwebs. So if it doesn't do exactly what you want, you can conform the server to your specific needs. For example, if you wanna hate life, there's a mod for FTP. Out of the box, Apache logs connections, includes a generic expression parser, supports WebDAV and CGI, can support embedded Perl, PHP, and Lua scripting, can be configured for public_html per-user web pages, supports .htaccess to limit access to various directories as one of a few authorization access controls, and allows for very in-depth custom logging and log rotation. Those logs include things like the name and IP address of a host, as well as geolocations. It can rewrite headers, URLs, and content. It's also simple to enable proxies. Apache, along with Linux, MySQL, and PHP, became so popular that the term LAMP was coined, short for those products. The prevalence allowed the web development community to build hundreds or thousands of tools on top of Apache through the 90s and 2000s, including popular Content Management Systems, or CMS for short, such as WordPress, Mambo, and Joomla. Other features include:
* Auto-indexing and content negotiation
* Reverse proxy with caching
* Multiple load balancing mechanisms
* Fault tolerance and failover with automatic recovery
* WebSocket, FastCGI, SCGI, AJP, and uWSGI support with caching
* Dynamic configuration
* Name- and IP address-based virtual servers
* gzip compression and decompression
* Server Side Includes
* User and session tracking
* Generic expression parser
* Real-time status views
* XML support
Today we have several web servers to choose from. Nginx, pronounced "Engine-X," is a newer web server that was initially released in 2004. Apache uses a thread per connection and so can only process as many connections as there are threads available, by default 10,000 on Linux and macOS. Nginx doesn't use a thread per connection, so it can scale differently, and is used by companies like Airbnb, Hulu, Netflix, and Pinterest. That 10,000 limit is easily controlled using concurrent connection limiting, request processing rate limiting, or bandwidth throttling. You can also scale with some serious load balancing and in-band health checks, or with one of the many load balancing options. Having said that, Baidu.com, Apple.com, Adobe.com, and PayPal.com - all Apache. We also have other web servers provided by cloud services like Cloudflare and Google slowly increasing in popularity.
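Before moving on, here's the promised taste of the mod system and that httpd.conf file: a minimal sketch of enabling mod_ssl for a single virtual host. The domain, file paths, and log location are hypothetical, and the exact module path and defaults vary by platform and Apache version.

# Load the SSL module (the path to the .so varies by platform and build)
LoadModule ssl_module modules/mod_ssl.so

Listen 443

<VirtualHost *:443>
    ServerName www.example.com
    DocumentRoot "/var/www/example"

    # TLS/SSL provided by mod_ssl
    SSLEngine on
    SSLCertificateFile "/etc/ssl/certs/example.crt"
    SSLCertificateKeyFile "/etc/ssl/private/example.key"

    # The kind of custom logging described above
    CustomLog "/var/log/apache2/example_access.log" combined
</VirtualHost>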
Tomcat is another web server. But Tomcat is almost exclusively used to run various Java technologies: servlets, EL, WebSockets, etc. Today, each of the open source projects under the Apache Foundation has a Project Management Committee. These provide direction and management for the projects. New members are added when someone who contributes a lot to the project gets nominated to be a contributor, and then a vote is held requiring unanimous support. Commits require three yes votes with no no votes. It's all ridiculously efficient in a very open source hacker kinda way. The Apache server's impact on the open-source software community has been profound. It is partly explained by the unique license from the Apache Software Foundation. The license was in fact written to protect the creators of Apache while giving access to the source code for others to hack away at it. The Apache License 1.1 was approved in 2000 and removed the requirement to attribute the use of the license in advertisements of software. Version two of the license came in 2004, which made the license easier to use for projects that weren't from the Apache Foundation, made GPL compatibility easier, and allowed a single reference for the whole project rather than attributing the license in every file. The open source nature of Apache was critical to the growth of the web as we know it today. There were other projects to build web servers, for sure. Heck, there were other protocols, like Gopher. But many died because of stringent licensing policies. Gopher did great until the University of Minnesota decided to charge for it. Then everyone realized it didn't have nearly as good of graphics as the web anyway. Today the web is one of the single largest growth engines of the global economy. And much of that is owed to Apache. So thanks, Apache, for helping us to alleviate a little of the suffering of the human condition for all creatures of the world. By the way, did you know you can buy hamster wheels on the web? Or cat food. Or flea meds for the dog. Speaking of which, I better get back to my chores. Thanks for taking time out of your busy schedule to listen! You should probably get to your chores as well, though. Sorry if I got you in trouble. But hey, thanks for tuning in to another episode of the History of Computing Podcast. We're lucky to have you. Have a great day!

