
Tutorial for Heather Payne's project - Part 1

The Project

I recently discovered Heather Payne's excellent blog. She is the founder of the Ladies Learning Code project that teaches introductory technical skills to women interested in coding.

To practice her skills, she has proposed a project for herself. The details below are excerpted from Heather Payne's blog post:

  • It’s a website.
  • Visitors enter their first name and last name in a box.
  • The site classifies you based on a) the number of Ladies Learning Code events you’ve been to, and b) the types of Ladies Learning Code events you’ve been to
    • For example, if you’ve been to one Ladies Learning Code workshop, the site will tell you that you’re “en route to coding mastery” in some hilarious way. If you’ve been to only the launch party, the site will tell you you’re a “party animal” in some hilarious way. If you’ve been to three workshops, it will show you an image of a really happy cat, etc. I’ll probably come up with some other surprise categories for certain people, if I have time and can figure out how to do it.
  • It’s going to include animated GIFs and be cheesy by design
  • Goal: to get people who have not attended a Ladies Learning Code event to join our mailing list.

This project sounds great! But how does one go about building a website like this? There are a million and one technologies, tools, approaches, and processes to consider, and it all gets to be too much, too quickly! I thought I'd write a tutorial that shows how I would approach this problem from an advanced beginner's point of view. I am not a programmer by trade - I'm actually an English lit major.

 

Breaking it down

I think Heather has a great mind and she's already broken things down into steps. I'll take apart the problem some more.

First of all - entering first and last names in a box. That sounds like an Internet form. Where do we see these? On almost every single website out there! Login boxes for Gmail or PayPal, checkout pages when we buy things online, newsletter sign-ups - wherever you can fill information into a website, that's a web form.
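
To make the idea of a web form a little more concrete: when a visitor presses the submit button, the browser usually sends the fields as one URL-encoded string. Here is a minimal sketch in Python of taking that string apart. The field names first_name and last_name are my own assumption (nothing in Heather's post names them), and a real site would receive the string from a web server rather than from a variable.

```python
from urllib.parse import parse_qs


def read_name_form(body):
    """Pull the two name fields out of a URL-encoded form submission.

    The field names "first_name" and "last_name" are hypothetical.
    Missing or empty fields come back as empty strings.
    """
    fields = parse_qs(body)  # e.g. {"first_name": ["Mary Jean"], ...}
    first = fields.get("first_name", [""])[0]
    last = fields.get("last_name", [""])[0]
    return first, last


# "+" stands for a space and "%C3%A9" is the UTF-8 encoding of "é".
print(read_name_form("first_name=Mary+Jean&last_name=Bubl%C3%A9"))
```

Notice that the form happily carries spaces and accented letters, which is exactly why the next question matters.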

 

What information do we want?

So there are two pieces to this puzzle. The first is the information - what do we want? What are the qualities of the information we're trying to capture? This sounds like a stupid question - duh - first and last names, how complicated can it be? If you're so smart, why don't you tell me what a first name and a last name are? For example:

  • Are they English letters only, or are characters from foreign languages allowed as well?
  • Are numbers or spaces allowed? For example, some people use names like happycat25. Some people have multi-word names that are not hyphenated, like my friend Mary Jean.
  • How long are these names usually? 25 characters? 50 characters?

Wow, how come we have to worry so much about these things? Because fundamentally, computers are not sentient beings (so far) and don't understand what names are. We understand what a name is, though.

Now we narrow things down. I want my project to take

  • English and French characters for now (world languages are great, but it's too hard right now for me to build)
  • Spaces, numbers, dashes and the usual symbols that you find in names are allowed. But things like @, %, ^, = should be filtered out. Who knows what people will put into that box! (Note: this is a security issue - to be explained later on when we work with the data.)
  • First and Last Names should have a max of 20 characters each.

Now we have some specs! We will keep revisiting these three "rules" as we proceed, to check if our assumptions about names are still valid.
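
Here is a rough sketch of how the three rules could look in Python. Treat it as one possible reading of the specs, not the final word: the exact set of allowed symbols (space, hyphen, apostrophe, period) is my assumption, and the accented-letter ranges cover Western European letters in general, which is a bit wider than French alone. The function name clean_name is made up for this example.

```python
import re
import unicodedata

MAX_NAME_LENGTH = 20  # rule 3: at most 20 characters per name

# Rules 1 and 2: English letters, accented Latin letters (covers French),
# digits, spaces, periods, apostrophes and hyphens. Anything else, such as
# @, %, ^ or =, makes the name invalid. The symbol list is an assumption.
ALLOWED_NAME = re.compile(r"[A-Za-zÀ-ÖØ-öø-ÿŒœ0-9 .'\-]+")


def clean_name(raw):
    """Return the tidied name, or None if it breaks one of the rules."""
    # Normalize so that "e" + combining accent becomes the single letter "é".
    name = unicodedata.normalize("NFC", raw).strip()
    if not name or len(name) > MAX_NAME_LENGTH:
        return None
    if not ALLOWED_NAME.fullmatch(name):
        return None
    return name


for candidate in ["Bublé", "Mary Jean", "happycat25", "cat=dog", "x" * 21]:
    print(repr(candidate), "->", repr(clean_name(candidate)))
```

This version rejects a bad name outright instead of silently stripping characters out of it. That is a design choice: telling the visitor "please try again" is easier to reason about than guessing what they meant.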

 

Where do we keep the information?

The second piece of the puzzle is where we keep this information.

Many people will jump straight into the deep end of the pool with the usual super technical answers: a database! MySQL! PostgreSQL! redis! MS Access! And now we descend into a pile of jargon that will only complicate things.

What we really need is a place to keep stuff. Ideally it's some kind of table. Example:

Record # | First Name | Last Name   | Number of events | Types of events
1        | Celine     | Dion        | 1                | party
2        | Michael    | Bublé       | 2                | workshop
3        | Rebecca    | Cunningham  | 3                | party, workshop
4        | Kit        | Cloudkicker | 4                | party, workshop


Yes I wish all these superstars came to Ladies Learning Code!

Wait a second - that looks suspiciously like a spreadsheet - because that's what it is!

Databases are spreadsheets that are sliced and diced up in ways that make it easy to look for information, so that's why most people jump straight into picking a database. But in this case, the information is really simple. We're not planning a wedding for 500 guests - there are no allergies and no people who can't sit next to each other. We have a list of people, and we want to check 1) if they exist on the list, and 2) call out the number and type of events they've been to.
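
Those two checks are small enough to sketch in Python with nothing but a plain list. The rows are the four from the example table; the category wording is borrowed from Heather's description, but the exact rules (and the catch-all "regular" category) are my guesses at what she has in mind. The function names are made up for this example.

```python
# The example table as a plain list of rows.
ATTENDEES = [
    {"first": "Celine", "last": "Dion", "count": 1, "types": ["party"]},
    {"first": "Michael", "last": "Bublé", "count": 2, "types": ["workshop"]},
    {"first": "Rebecca", "last": "Cunningham", "count": 3, "types": ["party", "workshop"]},
    {"first": "Kit", "last": "Cloudkicker", "count": 4, "types": ["party", "workshop"]},
]


def find_attendee(rows, first, last):
    """Check 1: is this person on the list? Ignores case and stray spaces."""
    wanted = (first.strip().lower(), last.strip().lower())
    for row in rows:
        if (row["first"].lower(), row["last"].lower()) == wanted:
            return row
    return None


def classify(row):
    """Check 2: turn the number and types of events into a category."""
    if row is None:
        return "not on the list yet - invite them to join the mailing list"
    types = set(row["types"])
    if types == {"party"}:
        return "party animal"
    if types == {"workshop"} and row["count"] == 1:
        return "en route to coding mastery"
    if types == {"workshop"} and row["count"] >= 3:
        return "really happy cat"
    return "regular"  # catch-all category, invented for this sketch


print(classify(find_attendee(ATTENDEES, "celine", "DION")))
print(classify(find_attendee(ATTENDEES, "Baloo", "Bear")))
```

For a list this short, walking through every row is perfectly fine. A database starts to earn its keep only when the list gets long or the questions get complicated.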

There are a lot of tools that we can use to solve this thorny problem. BUT I will go ahead and propose that we use Google Spreadsheets!

Why Google Spreadsheets?

  1. It looks like Excel, so most people already know how to put the data in.
  2. It's already online so we don't have to worry about setting up servers.
  3. It's protected by your Google password so we don't have to worry about people doing bad things with your list, and very importantly,
  4. There seems to be a way for code to interact with Google spreadsheets.

Number 4 is very important. In computer jargon, it's called an Application Programming Interface (API). Not every online service has an API, and some services offer better APIs than others. I don't know how good the Google spreadsheet API is. Will it let us do all the things we need to do? Who knows! Maybe it won't work and we'll have to find another way! That's part of the fun of programming. It's like driving around, finding adventure in an unfamiliar city.
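
While the API question is still open, there is a low-tech way to start experimenting: download the spreadsheet as a CSV file and read it with Python's built-in csv module. To be clear, this sketch does not touch the Google spreadsheet API at all. The column headings are the ones from my example table, and the sample text stands in for a downloaded file.

```python
import csv
import io

# Stand-in for a CSV export of the example spreadsheet.
SAMPLE_EXPORT = """Record #,First Name,Last Name,Number of events,Types of events
1,Celine,Dion,1,party
2,Michael,Bublé,2,workshop
3,Rebecca,Cunningham,3,"party, workshop"
4,Kit,Cloudkicker,4,"party, workshop"
"""


def load_rows(csv_text):
    """Turn the exported text into a list of dictionaries, one per person."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "first": record["First Name"],
            "last": record["Last Name"],
            "count": int(record["Number of events"]),
            # "party, workshop" becomes ["party", "workshop"]
            "types": [t.strip() for t in record["Types of events"].split(",")],
        })
    return rows


for row in load_rows(SAMPLE_EXPORT):
    print(row)
```

If the real API turns out to be good, only load_rows would need to change. Everything that works with the list of rows can stay the same.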

PS: For coders looking for something more advanced, another good fit for this kind of data is redis, which is a simple key-value store - simpler than a relational database, but more capable than a flat file.

Now I am going to stop here. The next step is to take all these ideas that we have, and start putting them into technical terms. We will be looking up and reading a lot of documentation! Stay tuned!

 

Top Tips for this lesson:

  • Think about the problem, and then choose the tools.
  • Be incredibly specific about what you want. The computer is not afraid of giving you everything that you want, all the time, and it's nag-proof.
  • Always be on the lookout for APIs. It's like people watching - you never know when you'll just run into a service that has a gem of an API.

 

 

An idea a day - indexed digital magazine library

As a participant in the Canadian small press / literary magazine community, I subscribe to a lot of magazines. But I have a big problem. My magazines pile up everywhere. I ran out of shelf space about a year ago, and now I have magazines on the floor, behind my monitor, on the windowsill, and peeking out from between pots and pans. Since I mostly subscribe to quarterlies, I also get a glut of them all at the same time, making it really difficult to pick out the one I want to read.

I want to find a way to create an electronic index of my magazines. The ideal situation is to scan every magazine that I own into a privately held collection, and to convert the text into a machine readable format for cross referencing. I imagine public libraries have similar systems to deal with their backlog, but they usually use large expensive archival services like LexisNexis, which are out of reach of the ordinary individual. I admit, the idea is not new: Google Books does the same thing to books, and they are trying to get a patent on scanning and indexing newsprint and magazines. I want something that is much more personal - something under my direct control. E-readers and magazine websites are not ideal, because they cut out the design element and instead treat content as discrete blocks: a chunk of text, a few images, a video or audio clip here and there - definitely not reflective of the designed magazine package that I love.

I tried to use Abbyy, an OCR software suite, to scan a copy of Ricepaper Magazine, but that didn't do much good because of the amount of design integrated into the magazine. I need something a lot more robust - something that can handle small and large fonts mixed in with foreign languages, with design elements that are not only decorative pieces but also form the bulk of meaning. Google encountered the same problems - their software had difficulties in recognizing multi-column text, large banner style headers, etc.

This is a great opportunity for the publishing industry - no one really understands how to parse design heavy packages of content. Publishers often have access to the original digital data files for magazines and books, but the value is locked within and never used.

A much wider challenge, however, is to create a collective search engine that indexes all Canadian literary magazines with the ability to recommend content depending on search terms - then enticing the searcher to purchase the content.

This sounds like a "me-too" idea - a riff off of the recommendation engine née search engine that Google brought into the world, but with a major distinction: It is our niche. We need to do this ourselves because we understand the market and the product intimately. We need to, at all costs, maintain that intimate bond with our customers, fans, readers, authors, and community. Without those bonds, our industry wouldn't exist.

If we don't run our own services - and confront our own fears about copyright, consumer rights, authorship, and the enjoyment of the media that we produced, that we own as the collective Canadian small press - someone else will do it for us. Google is eyeing the magazine business, licking its robotic lips as it dreams about how much ad revenue it can add to its bottom line by digitizing magazines the way it digitized books.

When that happens, we won't get a choice. So my rallying call is for every independent publisher in Canada to take a second look at their archives - to scan them, digitize them, find the original electronic copies, put them in a database, index them, even the very simple idea of going through the back issues and out-of-print items and adding keywords, tags, and metadata. Your own employees, authors, editors and volunteers can use this very valuable resource to make your business and your art a stronger presence in the digital world.

The result of this kind of effort can be a private repository or a public free-for-all - but this is an issue that we all need to take a moment in our busy day to address.

I will try to build a prototype system using the magazines that I have in my possession. 
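
As a first step towards that prototype, here is a minimal sketch in Python of the "very simple idea" above: tagging articles by hand and building a keyword index over them. It skips scanning and OCR entirely. The magazine names, titles and tags are placeholders, and the function names are invented for this example.

```python
from collections import defaultdict

# Hand-entered metadata; every value here is a placeholder.
ARTICLES = [
    {"magazine": "Magazine A", "issue": "Fall 2009", "title": "Essay One",
     "tags": ["poetry", "Vancouver"]},
    {"magazine": "Magazine B", "issue": "Summer 2009", "title": "Story Two",
     "tags": ["fiction", "vancouver"]},
    {"magazine": "Magazine A", "issue": "Spring 2009", "title": "Review Three",
     "tags": ["poetry", "reviews"]},
]


def build_index(articles):
    """Map each lower-cased tag to the set of articles that carry it."""
    index = defaultdict(set)
    for article in articles:
        entry = (article["magazine"], article["issue"], article["title"])
        for tag in article["tags"]:
            index[tag.lower()].add(entry)
    return index


def search(index, *tags):
    """Return the articles that carry every one of the given tags."""
    if not tags:
        return []
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return sorted(set.intersection(*matches))


index = build_index(ARTICLES)
print(search(index, "poetry"))
print(search(index, "poetry", "vancouver"))
```

Even an index this crude answers the question I actually have at home: which issue was that piece in?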

I don't want to ignore copyrights and authorship. But I do want publishers to start taking care of their own copyrights and do their own indexing, before someone with a much heavier commercial interest takes a greedy look at our content.

David Winer posted a very similar entry in his blog titled Big change in the tech world (Scripting News). It's well worth the read - now it's up to us to do something about it. Quoted here is a truly honest opinion of the tech / media landscape:

If you're in the media industry, stop partnering with the tech industry, and hire away some of their best people and give them power to run your business. This is how your boat will stay afloat. Pretending these companies are your friends is ridiculous. They don't care about you. Look at how well they're doing monetizing your content. This is probably what you need to learn to do, and there's no time to learn. Hire their people away and get ready to compete.

Business Model Generation collaborative book @business_design #bmgen

I only contributed a minimal amount to this project, but I'm proud of what the community was able to come up with. The internal process was a very fluid and enlightening example of a completely collaborative editorial and design process. Kudos to Alex for organizing all of this!

http://www.businessmodelgeneration.com/

http://www.businessmodelalchemist.com/

Video of the production of Business Model Generation - a visit to the book binder in Utrecht, NL (by Alex Osterwalder, on Vimeo)

Word on the Street 2009 Toronto notes

September 27, 2009 - Queen's Park, Toronto, ON

I spent a busy and thought-provoking afternoon at Word on the Street on Sunday. I missed WotS for the last two years, and I was determined to make it to the national book & magazine fest this year.

I met up with Alastair Cheng from the Literary Review of Canada at the magazines area on the eastern arm of Queen's Park Crescent, and we made our rounds through the booths.

Here is a brief summary of interesting ideas and people met:

1) Canada Council for the Arts: I snagged a print copy of their 2007/8 Annual Report, which also came with a CD with English and French PDFs. The latest 2008/9 report was released Sept 17, 2008, and the full set of documents is available at: http://www.canadacouncil.ca/aboutus/organization/annualreports/. The search function on their website was not working and spewed out error messages, so I had to run a Google search using site:canadacouncil.ca in order to find anything.

After two and a half years of dissecting annual and quarterly reports for CNW Group, I have new respect for the lowly annual report and associated financials. A project idea would be to build an online annual report for the CCA in the vein of PotashCorp's report done by zu.com

2) rabble.ca - conversations with Kim Elliot, publisher

At the rabble booth, we talked about Media Democracy Day and their upcoming media mapping project that, as far as I can gather, seeks to map out the creators and consumers of independent media in the Greater Toronto Area. It's a great idea for finding out who is in our community, what their interests are, and how we can help support the growth of relationships and linkages between indy media. This is an idea dear to me, and I'll definitely be looking for more information about this project. The discussion expanded to include university media projects such as futurity.org, where academic institutions host their own portal to highlight scientific or academic news that otherwise wouldn't get attention from mainstream media.

Other topics explored with Kim and other rabble.ca volunteers included a potential unified online ad exchange for Canadian independent media/news organizations that are all using OpenAds (which is now OpenX).

Apparently, about two years ago, there was a project on the table that sought to bring together organizations like rabble.ca, the Tyee, etc, to cooperate in web ad sales. This initiative petered out because there wasn't enough interest, but perhaps with the rise of web ads in the last two years, the idea can be revisited. Anyone who has more information about this, please email me, twitter me, or feel free to leave a comment here.

3) Ran into Jordan Himelfarb (from the Mark) at the maisonneuve tent. The Mark is an up and coming new site that's hitting some of the right buttons. I've got to investigate further, but at first glance, they're using getsatisfaction.com for feedback from their readers. It's something that I've wanted to do for the longest time: to use getsatisfaction as a collaborative editorial brainstorming tool.

4) New subscriptions purchased

The Walrus (recently, they were looking for an assistant editor - closing date was Sept 24th)

musicworks (Comes with a CD!)

Broken Pencil Subscription + their fiction anthology can'tlit: fearless fiction from broken pencil magazine

Toronto Life

Canadian Art

and of course

maisonneuve

How was your Word on the Street experience? If I were only a little bit ambitious, I would build an after-show vertical search engine that aggregated the #wots09 hashtag and associated media to compile the ultimate, automatically updated, national coverage for this great event. But alas. I have neither the time nor the technical chops to do it, and thus ends my review of Word on the Street Toronto.

Up next week: Reconstruction of a magazine website. - Stay tuned.

-jl

 

Using Opensocial to map 3rd person relationships

Posted to the Opensocial-Community group @ Sept 16, 2009 - OpenSocial Community > Using opensocial to describe the social network of a 3rd party

Hello all,

I apologize if the questions in this post have been addressed before. I was wondering if OpenSocial could be used, not to describe 1st person relationships (ie the quintessential Facebook profile), but to describe 3rd party relationships that had been discovered through research, freedom of information act requests, or some sort of machine-learned filtering.

I'm in the media/journalism business, and in the financial press, we need to keep track of who sits on whose board of directors, and the large number of C-level executive musical chairs that happen around us as industries go through acquisitions and contractions.

I'd like to use opensocial to build a researcher's map of this type of social relations, sort of the linkedin profile that we as the public build for a particular person: the journalist's black book - open for all to see and all to fathom the consequences.

I can see this being a public good as well, when applied to our politicians, who come out of one bad senate assignment only to become the next chairperson of another obtusely named governmental body.

It'd be very interesting to see how far the social graph concept can be pushed when applied to the complexities of business executives or politicians.

Jonathan Lin
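
Setting OpenSocial itself aside for a moment, the data behind such a researcher's map is small enough to sketch in plain Python: a list of (person, organization) pairs, and a function that finds organizations linked by a shared board member. Nothing here uses the OpenSocial API. The names are placeholders, and the function name interlocks is invented for this example.

```python
from collections import defaultdict
from itertools import combinations

# Researched "who sits on which board" facts; all names are placeholders.
SEATS = [
    ("Person A", "Company X"),
    ("Person A", "Company Y"),
    ("Person B", "Company Y"),
    ("Person C", "Company Z"),
]


def interlocks(seats):
    """Return {(org1, org2): people} for every pair of organizations
    that share at least one board member."""
    boards_by_person = defaultdict(set)
    for person, org in seats:
        boards_by_person[person].add(org)

    links = defaultdict(set)
    for person, orgs in boards_by_person.items():
        # Every pair of boards this person sits on is linked through them.
        for pair in combinations(sorted(orgs), 2):
            links[pair].add(person)
    return dict(links)


print(interlocks(SEATS))
```

The open question from the post remains: whether a social graph format built for first-person profiles can carry this kind of third-party record.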