Archive for the ‘Programming’ Category

Shouting From The Clouds

December 2nd, 2008 by ScottK | 1 Comment | Filed in JavaScript

A few months ago in Orlando, a new idea in transportable widgets was introduced to the developer community. The name of this concept is CloudShout. At this year’s IZEAFest, CloudShout had developed into a fully functional system, and introductory keys were given out not only to developers but to blog owners as well.

CloudShout is in closed alpha at the moment, and it is once again opening up to look for more developers and blog operators, which allows me to give out invite keys to those who are interested.

Anyone who surfs, blogs, or develops for the web would find CloudShout highly useful. The idea behind the transportable widget is that you can go from site to site with a set of applications that you or others can use. Instant messaging, LastFM, even a nice game of checkers!

From the surfer’s point of view, you can instant message friends who are also online, as long as you are both on CloudShout-equipped sites. With your apps in hand, you can extend your experience with real-time communication with your friends or other site visitors.

The first site to adopt CloudShout is SocialSpark, and since CloudShout is in closed alpha, registration must be done at SocialSpark to gain the full profile potential.

After setting up a blogger profile you can simply surf about your business. When you happen upon a CloudShout-equipped site, you’ll notice your display name as well as those of other registered users. The nice thing is that, through polling requests, you can watch others come and go from the site. Contact these users as you wish in a variety of ways.

Each user can choose from several applications to install for their own or others’ enjoyment. The big benefit here is that you can install any app from any CloudShout-equipped site you are on. When viewing a person’s profile, you can choose to install apps that they have and you want, without ever leaving the site you are on.

As a blog operator, it’s nice to be able to communicate instantly with your visitors as they come and go. Visitors can leave you messages apart from comments, with the added benefit of not exposing the owner’s email address. Blog-specific apps can be installed, yes via the site, that will potentially have visitors stay longer.

For developers this is another opportunity to get your name out. Just as Netvibes, iPhone, Nokia, Facebook, and others allow developers to create applications for their community to use, so does CloudShout. Using a predefined widget architecture, developers can quickly build and submit applications. After a review process the application, if approved, is made available to the community at large.

I have keys to give out to those interested in developing and/or using CloudShout while it is still in closed alpha. Leave me a comment and I will send a key to the email address you leave.


Finding The True Domain Using Ruby On Rails

November 3rd, 2008 by ScottK | 3 Comments | Filed in ruby

So here’s the problem: your Ruby on Rails app takes in one URL and then another URL. You then need to compare these two inputs to make sure they reside on the same domain. http://techraving.com has the same true domain as https://news.techraving.com. However, http://news.techraving.com does not reside on http://www.yahoo.com. And what about http://localhost:8080 being the same domain as http://localhost:443/index.html?

You could write an overly complicated set of methods to start detecting string positions, or even incorporate a bunch of gsub regexes to try to weed out the unwanted components. Or you could take the easy route and call the URI module.

The URI module easily breaks the URL string into its different parts, which you can then refine further, using arrays, into what you need for comparison. Taking these two URLs:

http://techraving.com/about/

http://news.techraving.com/about/

We need to find out whether they are on the same domain. So:

require "uri"  # already loaded inside a Rails app

first_domain = URI.parse("http://techraving.com/about/")
second_domain = URI.parse("http://news.techraving.com/about/")

Now first_domain and second_domain are two distinct URI objects (URI::HTTP instances, since both URLs use http). The next step is to let these objects return us the host by calling “host”:

>> first_domain.host
=> "techraving.com"

>> second_domain.host
=> "news.techraving.com"

That was a lot of work already done for us. But we are not done yet! We still need to find the true domain. If you look closely, that is the last two elements of the host once it is split on the dots. So let’s split into an array:

>> first_array = first_domain.host.split(".")
=> ["techraving", "com"]

>> second_array = second_domain.host.split(".")
=> ["news", "techraving", "com"]

So the arrays are what we expect, in that the last two elements are the parts we need. Now how best to grab them? Easy enough!

# A single-label host such as "localhost" has nothing to append.
if first_array.size == 1
  first_true_domain = first_array[0]
else
  # Use + rather than << so the array elements are not modified in place.
  first_true_domain = first_array[first_array.size - 2] + "." + first_array[first_array.size - 1]
end

if second_array.size == 1
  second_true_domain = second_array[0]
else
  second_true_domain = second_array[second_array.size - 2] + "." + second_array[second_array.size - 1]
end

Now why can the array size equal 1? That’s because:

>> local_array = URI.parse("http://localhost:80").host.split(".")
=> ["localhost"]

and,

>> local_array = URI.parse("http://localhost:8080").host.split(".")
=> ["localhost"]

Both arrays have a size of 1, and both are the same domain.

So now that we have that out of the way, why the subtraction in the indexes? Because array.size - 2 returns the domain name and array.size - 1 returns the generic top-level domain, we can now put the two together by concatenating them with the “.” to get the true domain.

Now (first_true_domain == second_true_domain) comparisons can be made without sub-domain or port problems, and with a ton less string position/replace code. Note that taking the last two parts assumes a single-label top-level domain such as .com; a host like news.example.co.uk would need a public-suffix lookup instead.

Calculating User Reputation in Your App Using Bayesian

November 3rd, 2008 by ScottK | No Comments | Filed in Programming

Almost everyone who has had the opportunity to meet you in some form or fashion, whether in person, by reading your work, or by watching you on TV, has formed an opinion about you and given you their reputation score. Likewise, I’m sure you’ve determined the reputation of everyone you’ve met as well. With reputation you can determine which of the people you know you can trust, which you have to distrust, and so on.

Even as you read this post, along with others of mine, you can subjectively give me a reputation as a good writer, or not; as someone knowledgeable, or not.

We know and live with the importance of reputation scoring, so how can we apply it to our programming to separate the bad members from the good members of our applications? It’s quite easy really, once you break out the Bayesian calculation and have at least one attribute with positive and negative counts. So let’s look at how simple the calculation is as a Python function; it can certainly be used the same way in any language.

def probability(badvotes, goodvotes, user_weight=0.5): #Written in Python BTW
    proportion = float(badvotes) / (float(goodvotes) + float(badvotes))
    S = 0.5
    n = goodvotes + badvotes
    return (float(S * user_weight) + (n * proportion)) / (S + n)

The probability function takes the negative and positive vote counts plus an optional weight adjustment and produces a result between ~0.0 and ~1.0; 0.5 is the default neutrality.

proportion = float(badvotes) / (float(goodvotes) + float(badvotes))

This line gives us the proportion of bad votes out of all votes cast.

S = 0.5

S is the weight of the Bayesian prior: how many votes’ worth of influence our starting assumption gets. The assumption itself is user_weight. At the default of 0.5 the odds are that the writer is neither good nor bad at writing. If we set user_weight=0.75 the odds are that the writer is a bad writer; likewise with user_weight=0.25 the writer is a good writer.

n = goodvotes + badvotes

Easy enough, get the total votes.

return (float(S * user_weight) + (n * proportion)) / (S + n)

Here’s the heart of the calculation. Return the:
a.) Prior weight S times the weighting we assigned,
plus
b.) The number of votes made times the proportion of bad votes (which is simply the number of bad votes)
divided by
c.) S plus the number of votes
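
As a quick check on the breakdown above, here is a minimal sketch that restates the same calculation with the algebra simplified (n * proportion is just badvotes). It also covers members with no votes yet, a case where the original function stops with a division by zero. The name reputation_score, the prior_strength parameter, and the zero-vote behaviour are assumptions made for illustration, not part of the original function.

```python
def reputation_score(badvotes, goodvotes, user_weight=0.5, prior_strength=0.5):
    """Same calculation as probability(), with S exposed as prior_strength.

    Hypothetical helper: the name and the prior_strength parameter are
    illustrative assumptions, not part of the original post.
    """
    n = goodvotes + badvotes
    # n * proportion in the original reduces to badvotes, so with zero
    # votes nothing divides by zero and the result is just user_weight.
    return (prior_strength * user_weight + badvotes) / (prior_strength + n)


if __name__ == "__main__":
    # No votes yet: the score is just the prior.
    print(reputation_score(0, 0))                                 # 0.5
    # A single bad vote moves a new member most of the way toward 1.0.
    print(round(reputation_score(1, 0), 4))                       # 0.8333
    # With 90 votes the half-vote prior barely matters.
    print(round(reputation_score(15, 75), 4))                     # 0.1685
    # A prior worth 10 votes keeps the same member closer to 0.5.
    print(round(reputation_score(15, 75, prior_strength=10), 4))  # 0.2
```

The last two calls show the design choice hiding in S: the smaller it is, the faster real votes take over from the starting assumption.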

All done and in four lines of code. So let’s see this in action!

You run a content delivery site and are looking for new authors to write for you. You want writers with good articles who deliver on time and make good comments on your blog. Stephen has been recommended by many users of your site. You noticed Peter because he’s written the most articles for you. Here are the stats for both to use in our calculations.

Peter:
Articles submitted: 100
Good votes: 75
Bad votes: 15
Articles on time: 20
Articles late: 80
Comments voted as good: 10
Comments voted as bad: 20

Stephen:
Articles submitted: 200
Good votes: 110
Bad votes: 30
Articles on time: 190
Articles late: 10
Comments voted as good: 50
Comments voted as bad: 70

So here are the probabilities, rounded to four decimal places (totals are averaged from the unrounded scores):
Peter:
Being a good writer: 0.1685 (probability(15,75)) < .5 A good writer.
Writing articles on time: 0.7985 (probability(80,20)) > .5 Not a good reputation for delivering on time.
Being a good member: 0.6639 (probability(20,10)) > .5 Not so good as a site member.
Total overall reputation: 0.5437 (0.1685 + 0.7985 + 0.6639) / 3

Stephen:
Being a good writer: 0.2153 (probability(30,110)) < .5 A good writer.
Writing articles on time: 0.0511 (probability(10,190)) < .5 Actually a really good reputation score.
Being a good member: 0.5830 (probability(70,50)) > .5 Not so good as a site member.
Total overall reputation: 0.2831 (0.2153 + 0.0511 + 0.5830) / 3

From these three stats you may be able to choose a writer. Even if you average the probabilities for a total overall reputation, Peter: 0.5437 and Stephen: 0.2831, you would think that Stephen is the best choice, since 0.2831, being closer to zero, is awesome, and Peter’s score of 0.5437 is slightly above the bad member mark of 0.5. What we really want is a good writer who delivers good articles on time. Site interaction is really just secondary.

So now let’s look at the primary reputation and weight the secondary calculation using the user_weight argument. The average of the two primary scores feeds the third, secondary calculation as its user_weight, just as a lot of our assessments of other people take other factors into consideration.

So here are the probabilities:
Peter:
Being a good writer: 0.1685 (probability(15,75)) < .5 A good writer.
Writing articles on time: 0.7985 (probability(80,20)) > .5 Not a good reputation for delivering on time.
Being a good member: 0.6637 (probability(20,10, ((0.1685 + 0.7985) / 2))) > .5 Not so good as a site member.
Total overall reputation: 0.5436 (0.1685 + 0.7985 + 0.6637) / 3

Stephen:
Being a good writer: 0.2153 (probability(30,110)) < .5 A good writer.
Writing articles on time: 0.0511 (probability(10,190)) < .5 Actually a really good reputation score.
Being a good member: 0.5815 (probability(70,50, ((0.2153 + 0.0511) / 2))) > .5 Not so good as a site member.
Total overall reputation: 0.2826 (0.2153 + 0.0511 + 0.5815) / 3

Remember ~0.0 = best and ~1.0 = worst. The differences are small, but you can see that by finding all your primary considerations for reputation you can apply them to any secondary considerations for a different total outcome. Peter started with a total overall reputation of 0.5437; because the average of his primary scores, 0.4835, sits just below the neutral 0.5, weighting the secondary attribute of comment interactivity only nudges him to 0.5436. Stephen started with a great reputation, 0.2831, and his much stronger primary average of 0.1332 pulls his total further down to 0.2826. The movement is tiny because S = 0.5 gives the weighting only half a vote of influence against the 30 and 120 comment votes already cast.
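
To check the tables above, here is a small script, a sketch only, that reruns both passes with the probability function from earlier in the post. The MEMBERS dictionary layout and the overall_reputation helper are assumptions made for this example; the vote counts are the ones listed for Peter and Stephen. Values are printed rounded to four decimal places, so the last digit can differ from figures that were truncated or added up by hand.

```python
def probability(badvotes, goodvotes, user_weight=0.5):
    # Same function as earlier in the post.
    proportion = float(badvotes) / (float(goodvotes) + float(badvotes))
    S = 0.5
    n = goodvotes + badvotes
    return (float(S * user_weight) + (n * proportion)) / (S + n)


# (badvotes, goodvotes) pairs taken from the stats listed in the post.
MEMBERS = {
    "Peter": {"writing": (15, 75), "on_time": (80, 20), "comments": (20, 10)},
    "Stephen": {"writing": (30, 110), "on_time": (10, 190), "comments": (70, 50)},
}


def overall_reputation(stats, weighted):
    """Average the two primary scores with the secondary comment score.

    Hypothetical helper: when weighted is True, the mean of the two
    primary scores is passed as user_weight for the comment score.
    """
    writing = probability(*stats["writing"])
    on_time = probability(*stats["on_time"])
    if weighted:
        prior = (writing + on_time) / 2
        comments = probability(*stats["comments"], user_weight=prior)
    else:
        comments = probability(*stats["comments"])
    return (writing + on_time + comments) / 3


if __name__ == "__main__":
    for name, stats in MEMBERS.items():
        plain = overall_reputation(stats, weighted=False)
        tuned = overall_reputation(stats, weighted=True)
        print(f"{name}: unweighted {plain:.4f}, weighted {tuned:.4f}")
```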

So there you have it: building a reputation system is relatively easy. All you need is four lines of code and at least one attribute for which you can count negatives and positives. Identify the primary attributes, average them, and feed that average to the sub-attributes; their scores can in turn be fed to any number of further sub-attributes, and so on. The choice of design is completely yours to detect who your best and worst members are.
