between projects
Mood: contemplative
Posted on 2006-11-06 08:59:00
Tags: netflixprize projects proandcon
Words: 666

Vote tomorrow!

(I've been kinda unhappy and antsy the last few days, so the below is as much for myself as for anyone else...)
So I'm generally happiest when I have a side project that I'm working on outside of work, especially when work stuff is boring (i.e. now). I have been working on the Netflix Prize, but as of this evening (when my last run finishes and produces the same unimpressive results) I'll be done. Things that made it a good project:

- I used C++, which I haven't used in a little while
- The whole contest framework gave me a way to quantify how well the project was going
- Using real data with real movies is always neat

Not so good things:
- Because there was so much data, the nature of the project was to work on something for a few minutes or an hour and then leave it running for days. This meant at any given point in time I probably couldn't work on the project, and my computer was tied up so doing other things was painful.
- I like projects that other people can use (vanity?), but there's not much one can use about this one. I'm thinking of putting up a related movies finder but that might violate the TOS of the contest.
- Because people are so good it became pretty darn hard to get on the leaderboard.
- It was fun to play around with algorithms, but apparently you need a good idea or to do a lot of research to find a good one, which I didn't have/do.

Anyway, so like I said I'm putting this project to bed this evening. My question is, what will I work on next? I have some ideas stored up but I'm not thrilled with any of them.

LiveJournal backup - provide some way to back up all LJ posts and the comments on those posts.
Minus: There's already a decent way to do this, except for comments.
Plus: This would be potentially useful for myself and others.
Minus: I have a hard time figuring what outputs to produce: one giant page with tons of entries and comments? A zip file with pages for each month?
Minus: Because of friends-locked posts, you would only want to view it on your own computer (and not publish it or anything) unless I somehow tied LJ users in with viewers of the page, which is not going to happen.
Plus: Ooh, I could provide some interesting statistics on moods and such. Or even some kind of randomish text from your LJ posts. Maybe.

Some kind of World of Warcraft mod
Plus: I've already done this, so I have the basics down at least.
Minus: I've already done this so it wouldn't be as interesting.
Minus: I don't have an idea of what kind of mod would be useful - there are lots of them out there already.
Minus: Developing for WoW is kind of a pain, since you have to open WoW a lot and develop at the same time, which is kinda slow.
Minus: I'm already spending a lot of free time in WoW...on the one hand this means I might be using whatever mod I make a lot, but I'd also prefer a non-WoW-related project.

Adding annotations to the baseball Win Expectancy Finder graph.
Minus: This project already exists, so it's less interesting to add little features to it.
Minus: It's not baseball season anymore, making this seem less appealing.
Minus: The main problem is coming up with a placement algorithm for annotations so they don't run into each other or the graph, which sounds pretty hard.
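
To make the placement problem a bit more concrete, here is a minimal sketch of one naive approach: try a handful of fixed offsets around each point and keep the first label box that doesn't overlap anything already placed. This is not how the Win Expectancy Finder works; `Box`, `OFFSETS`, `place_labels`, and the label sizes are all made up for illustration, and the graph itself is assumed to be approximated by a list of obstacle boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle; (x, y) is one corner, w and h are its size."""
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        # Two rectangles overlap unless one lies entirely to one side of the other.
        return not (
            self.x + self.w <= other.x
            or other.x + other.w <= self.x
            or self.y + self.h <= other.y
            or other.y + other.h <= self.y
        )


# Hypothetical candidate offsets (dx, dy) from the annotated point, tried in order.
OFFSETS = [(5, 5), (5, -25), (-65, 5), (-65, -25)]


def place_labels(anchors, label_w=60, label_h=20, obstacles=()):
    """Greedily place one label per anchor point.

    Returns a list with a Box for each anchor, or None where no
    candidate position was free.
    """
    placed = []
    result = []
    for ax, ay in anchors:
        chosen = None
        for dx, dy in OFFSETS:
            candidate = Box(ax + dx, ay + dy, label_w, label_h)
            if all(not candidate.overlaps(b) for b in list(obstacles) + placed):
                chosen = candidate
                break
        if chosen is not None:
            placed.append(chosen)
        result.append(chosen)
    return result
```

A greedy pass like this is order-dependent and can give up on a label that a smarter search would fit, which is probably part of why the real problem sounds hard.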

Add some kind of Getting Things Done style tickler list to my todo list
Plus: I kinda use the todo list, and working on it might be more incentive for me to use it again.
Minus: ...but I don't really use Getting Things Done stuff anymore. Without that, the todo list is pretty good as it is.
Minus: Again, not a new project.


13 comments

Comment from llemma:
2006-11-06T09:20:19+00:00

A few years ago my sister was looking for very simple, object-oriented, effective family history software. Not a superexpensive genealogy package that would help you track primary-source documents in overseas libraries, but a way to jot down who was married to whom, link in photos and audio recordings and stories, and maybe even keep track of conflicting or questionable information. I'd love to do that someday but I know I don't have time, and it would be a widely useful project.

Comment from girdsman:
2006-11-06T17:18:47+00:00

Try geneweb

This software may require a little knowledge to set up, but has everything you described, and is open source (hence free). It uses a web interface, and can be set either for individual usage, or as an internet server. With minimal understanding of HTML, it is possible to link whatever you want (after you set up the Apache web server). Basically, it can be slightly challenging to get started, but once set up, it's trivial to use and maintain, fully featured, and free.


http://cristal.inria.fr/~ddr/GeneWeb/en/index.html

Comment from llemma:
2006-11-06T17:55:40+00:00

Cool -- thanks!!

Comment from yerfdogyrag:
2006-11-06T10:04:13+00:00

So, here's one. It's more of a difficult design problem than a difficult programming problem. There are a lot of situations where people get together: lunch, game nights, parties, quartetting, etc. What would be nice is to have a place where you can say, "I can game on Tuesday and Thursday evening", and other people can do the same, and an Event pops up.

Now, Events need to have their own dependencies before firing. For instance, every Game Night needs a Host. A quartet needs tenor, lead, baritone, bass. Also, you need to put in how often (maximum) you want that game night to fire. That way you're not doing game night every night. It also means that you can say, "I'll host maximum every 6 weeks".

And individuals need to be able to prioritize their Events. I want the Northern Game Night before the Southern Game Night.

This also works for restaurants. I can say, "Chili's every two weeks max". Every day at 10:00, an email is sent out with the "best" choice of restaurant(s)/people.

Hmmmm... I wonder if you should be able to say, "I will go to any restaurant unless xxx goes along". Naaaa.

Oh, and you'll want to do this in django.
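
As a rough illustration of the firing rules described in this comment, here is a minimal sketch in plain Python (no django). It only covers three of the ideas: weekday availability, required roles, and a maximum firing frequency per Event. Per-person limits ("I'll host maximum every 6 weeks") and Event priorities are left out. `Person`, `Event`, `try_fire`, and the names in the usage example are all hypothetical, and the role assignment is a simple greedy pick, so it can miss a valid assignment when people can fill more than one role.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


@dataclass
class Person:
    name: str
    available: set                              # e.g. {"Tuesday", "Thursday"}
    roles: set = field(default_factory=set)     # roles this person can fill


@dataclass
class Event:
    name: str
    required_roles: list        # e.g. ["host"] or ["tenor", "lead", "baritone", "bass"]
    min_gap: timedelta          # the Event fires at most once per this interval
    last_fired: Optional[date] = None


def try_fire(event, people, day):
    """Return a {role: person name} dict if the event can fire on `day`, else None."""
    # Respect the maximum frequency for this Event.
    if event.last_fired is not None and day - event.last_fired < event.min_gap:
        return None
    weekday = WEEKDAYS[day.weekday()]           # date.weekday(): Monday is 0
    free = [p for p in people if weekday in p.available]
    assignment = {}
    used = set()
    for role in event.required_roles:
        # Greedy: take the first free person who can fill this role.
        match = next((p for p in free
                      if role in p.roles and p.name not in used), None)
        if match is None:
            return None                         # a dependency is unmet, so don't fire
        assignment[role] = match.name
        used.add(match.name)
    event.last_fired = day
    return assignment
```

For example, with `people = [Person("alice", {"Tuesday", "Thursday"}, {"host"}), Person("bob", {"Tuesday"})]` and `Event("Game Night", ["host"], timedelta(days=7))`, calling `try_fire` on a Tuesday assigns alice as host, and a second call two days later returns `None` because of the weekly limit.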

Comment from wildrice13:
2006-11-06T16:55:28+00:00

I love this idea too. I envision a spreading database where people get to meet friends of friends and get more organized about forming wonderful groups for various purposes and everyone's happier and more social.

Then again, that's rather idealistic. Surely the higher the number of people, the higher the potential for drama (your "I will go to any restaurant unless xxx goes along" comment made me laugh...). But such is life.

Comment from djedi:
2006-11-06T19:42:36+00:00

Yeah, such a thing would be pretty useful and interesting. I guess when I picture "the future", I already sorta had in mind that we'll have computer calendars that can do stuff like this and make socializing much easier.

Comment from wildrice13:
2006-11-06T20:02:35+00:00

THE FUTURE IS NOW!!

well, if greg makes the app anyway :P

Comment from gregstoll:
2006-11-07T08:17:42+00:00

This is indeed a neat idea. If I can figure out how to do it and such, I'll make it my next project. Also, I know you've prodded me about django in the past, and it does look like a neat system, and I have been wanting a more database-y project. And Python is good, although I'm more in a Ruby state of mind ATM...

Comment from amorphousplasma:
2006-11-06T10:55:00+00:00

OMG do the Livejournal thing. Better yet, figure out a way to make all your past entries friends only with one mouse click.

Can you have it export to a calendar with entry titles on the days you made entries? And then you click on the title (or beginning of entry) to read it?

Comment from wildrice13:
2006-11-06T16:53:32+00:00

That would be excellent. Sounds kinda involved, but VERY useful.

Comment from gregstoll:
2006-11-07T08:16:14+00:00

OK, I'm gonna work on the LJ backup thing.

For making everything friends only, you might want to check out the various clients available. If you can't find anything that works I can try to add that functionality...

So for the calendar thing, do you mean export to a "real" calendar? (Outlook, Google Calendar or something) Or do you mean just a page like my archive page?

Comment from amorphousplasma:
2006-11-08T15:14:48+00:00

Either one. Google would be cool. Hey that event idea up there is ingenious.

Comment from anonymous:
2006-11-12T04:31:18+00:00

Hi,
I would suggest you not give up on Netflix yet. I have also been disappointed by my inability to get onto the leaderboard. I am always a week too late for the scores on the leaderboard. I am now able to get an RMSE of 0.958-0.962, which would have put me on the leaderboard a week ago.

But I am still trying. I would suggest you look at alternatives to your current strategy. A plain correlation-based analysis (using clustering so that correlations are only computed for related movies) didn't take me far either, so I am now trying to use SVD to do dimensionality reduction, and then trying various alternatives.

Surprisingly, some people have claimed that correlation-based analysis took them to an RMSE of 0.94ish. I wonder what special trick they used.
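
For anyone unfamiliar with the RMSE numbers quoted in this comment: RMSE is the root mean squared error between predicted and actual ratings, so lower is better. A minimal sketch of the calculation follows; the function name `rmse` is just illustrative, and the ratings in the usage note are made up rather than taken from the contest data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For instance, `rmse([1, 5], [2, 4])` is 1.0, since each prediction is off by exactly one star.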

This backup was done by LJBackup.