Using Classes in PHP

I’ve been working on an RSS aggregator (written in PHP) recently, similar to “Feed On Feeds”:http://minutillo.com/steve/feedonfeeds/, but with a slight twist. But what I’m doing’s not important for the rest of this post.

Where do you draw the line at too many classes? I’ve been trying to put some thought into the design of the code (not something I usually do) but yet again it seems to have caused more grief than it warrants.

I started by thinking which bits should have their own classes. I began with the design:

bc. Class Feed ($feed_url)
update()
classify()

Then I thought it would make more sense to look at the feeds _in terms of items_. After all, once they’re stored in the database I don’t care about the feed, only the individual items. This led me to:

bc. Class Feed ($feed_url)
update() - which instantiates an Item for each item

bc. Class Item (&$data)
classify()
save()
reclassify()
delete()

Yet this still doesn’t seem a satisfactory design. Not only does it seem like lots more overhead due to PHP having to handle all these classes, but *I really can’t see the benefit over a more general structure* - i.e. just one class that I apply to _all feeds_ in turn (as opposed to instantiating a separate class for each feed), or even (dare I say it) *functions*. Perhaps someone can enlighten me where I’m going wrong. We haven’t touched on OO design yet at University (and thanks to the organisation of my course, I probably never will), so I’m swimming in the dark.
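To make the comparison concrete, here is a rough sketch of the two-class design next to the function-only alternative. It is only an illustration: the shape of the parsed rows, the keyword-based classify() rule and the fact that update() takes already-parsed data are all assumptions made up for the example, not how the real aggregator works.

```php
<?php
// Hypothetical sketch of the Feed/Item split. Nothing here touches the
// network or a database.
class Item
{
    public $title;
    public $category = null;

    public function __construct(array $data)
    {
        $this->title = $data['title'] ?? '';
    }

    // Placeholder rule; a real classifier would be more involved.
    public function classify()
    {
        $this->category = stripos($this->title, 'php') !== false ? 'php' : 'other';
        return $this->category;
    }
}

class Feed
{
    private $feedUrl;

    public function __construct($feedUrl)
    {
        $this->feedUrl = $feedUrl;
    }

    // Takes already-parsed rows (assumed shape: ['title' => ...]) and
    // instantiates one Item per row.
    public function update(array $parsedRows)
    {
        $items = [];
        foreach ($parsedRows as $row) {
            $item = new Item($row);
            $item->classify();
            $items[] = $item;
        }
        return $items;
    }
}

// The function-only alternative: the same rule applied to a plain array.
function classifyRow(array $row)
{
    $row['category'] = stripos($row['title'] ?? '', 'php') !== false ? 'php' : 'other';
    return $row;
}
```

Both routes end up with the same category for each item; the difference is only in where the behaviour lives.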

Over to you guys.

4 comments ↓

#1 Harry Fuecks on 09.21.03 at 9:21 pm

Some thoughts;

First, don’t worry too much about the performance cost of creating objects to represent different aspects of the feed. Reckon it’s best to start with a design you’re happy with and deal with optimization later.

A much bigger issue will be how to avoid connecting to each (remote) RSS feed over and over again - perhaps best dealt with by a cron job.

Might be best to regard the RSS data much like you would a database - you have a parser which you could treat as an equivalent to a database connection.

Something like

// Data access - (implements simple HTTP client)
class FeedConnection {
    private $url;

    function __construct($url) {
        $this->url = $url;
    }
}

// Iterates over Channels - perhaps place SAX handlers here
class Feed {
    private $connection;

    // Pass it an instance of FeedConnection
    function __construct(FeedConnection $connection) {
        $this->connection = $connection;
    }

    // Iterates over the channels, creating Channel objects
    function getChannel() {

    }
}

class Channel {
    function __construct($channelData) {}

    // General info about channel - perhaps use multiple methods
    function getChannelInfo() {}

    // Iterates over the items returning Item objects
    function getItem() {}
}

class Item {

}

The tricky part, for performance, will be to only iterate over the data as few times as possible (ideally only once). That’s tough with SAX.
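As a rough illustration of the single-pass idea, the sketch below collects every item in one run of PHP’s built-in SAX-style XML parser. The function name parseItems() and the flat array shape it returns are assumptions for the example, and it only handles simple RSS 2.0 style items whose children contain plain text.

```php
<?php
// Hypothetical helper: collect every <item> of an RSS document in one SAX pass.
function parseItems(string $xml): array
{
    $items = [];
    $current = null; // item being built, or null while outside <item>
    $field = null;   // name of the child element currently open

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$current, &$field) {
            if ($name === 'item') {
                $current = [];
            } elseif ($current !== null) {
                $field = $name;
                $current[$field] = '';
            }
        },
        function ($p, $name) use (&$items, &$current, &$field) {
            if ($name === 'item') {
                $items[] = $current;
                $current = null;
            }
            $field = null;
        }
    );

    // Character data can arrive in several chunks, so append rather than assign.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$current, &$field) {
            if ($current !== null && $field !== null) {
                $current[$field] .= $data;
            }
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    return $items;
}
```

Because the handlers build the item arrays as the parser goes, the document is only walked once; anything that later wraps those arrays in Item objects works on the result, not on the XML.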

#2 Harry Fuecks on 09.21.03 at 9:36 pm

Some more thoughts;

You probably also want a “container” class for Feed, say class Aggregator.

This might have a method like getFeedByUrl($url) which returns a Feed object, passing it an instance of FeedConnection on instantiation.
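A minimal sketch of that container might look like the following. The stub FeedConnection and Feed classes are only there so the example stands alone, and reusing an already created Feed for the same URL is an assumption of mine rather than a requirement.

```php
<?php
// Hypothetical stubs so the sketch is self-contained.
class FeedConnection
{
    private $url;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function url()
    {
        return $this->url;
    }
}

class Feed
{
    private $connection;

    public function __construct(FeedConnection $connection)
    {
        $this->connection = $connection;
    }

    public function url()
    {
        return $this->connection->url();
    }
}

class Aggregator
{
    private $feeds = [];

    // Returns the Feed for $url, creating it (and its connection) on first use.
    public function getFeedByUrl($url)
    {
        if (!isset($this->feeds[$url])) {
            $this->feeds[$url] = new Feed(new FeedConnection($url));
        }
        return $this->feeds[$url];
    }
}
```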

Then, back to the Feed class: I’d really recommend looking at PEAR::Cache_Lite.

In the feed constructor you might have;

function __construct($connection) {
    $this->connection = $connection;
    $cache = getCache(); # Global factory function

    // If we don’t have a cached version…
    if ( !($this->feedData = $cache->get($connection->url())) ) {
        $this->update();
    }
}

function update() {
    $this->connection->connect();
    // Parses the XML and caches the parsed data structure
    $this->parse();
}

Something like that. There’s still a problem about _when_ to perform updates - perhaps that’s where the cron job comes in - it builds some kind of file which keeps track of the last time each feed was updated. You should then be able to compare this with the cached data.
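One way the tracking file could work is sketched below. The JSON format, the function names and the one-hour default are all assumptions for the sake of the example; the cron job would call feedNeedsUpdate() for each URL and write the new timestamp back after a successful fetch.

```php
<?php
// Hypothetical helpers for a small "last updated" tracking file.
// The file holds a JSON object mapping feed URL => Unix timestamp.

function loadLastUpdated($path)
{
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode(file_get_contents($path), true);
    return is_array($data) ? $data : [];
}

function saveLastUpdated($path, array $lastUpdated)
{
    file_put_contents($path, json_encode($lastUpdated));
}

// A feed is due when it has never been fetched or its last update is
// at least $maxAge seconds old.
function feedNeedsUpdate(array $lastUpdated, $url, $now, $maxAge = 3600)
{
    if (!isset($lastUpdated[$url])) {
        return true;
    }
    return ($now - $lastUpdated[$url]) >= $maxAge;
}
```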

#3 Peter Bowyer on 09.22.03 at 7:49 pm

Hi Harry,

Thanks for your comments. I’d not actually thought much about parsing the RSS - I was hoping to use a prewritten library to do that (although at this rate I may well be writing my own). Hopefully my comments/questions below make sense, but I’ve got something like Flu at the moment, so I’m a bit fuzzy in the head.

I envisage the system being updated by a cron job, but I also want to provide a way of manually updating. The parser will implement the “Last-Modified and ETag handling”:http://fishbowl.pastiche.org/archives/001132.html which should cut bandwidth usage. All the feed items will end up being stored in a database.
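The conditional request side of this could be sketched roughly as follows. The function names are hypothetical, and the sketch deliberately stops short of the HTTP client itself: it only builds the request headers from the values stored on the previous fetch and interprets the status code that comes back.

```php
<?php
// Build the headers for a conditional GET from the ETag and Last-Modified
// values saved when the feed was last fetched (either may be missing).
function conditionalHeaders($etag, $lastModified)
{
    $headers = [];
    if ($etag !== null && $etag !== '') {
        $headers[] = 'If-None-Match: ' . $etag;
    }
    if ($lastModified !== null && $lastModified !== '') {
        $headers[] = 'If-Modified-Since: ' . $lastModified;
    }
    return $headers;
}

// A 304 Not Modified response means the stored copy is still current,
// so there is nothing to download or parse.
function feedChanged($statusCode)
{
    return (int) $statusCode !== 304;
}
```

On a 304 the server sends no body, which is where the bandwidth saving comes from.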

In my thinking the parsing of the RSS should be separate from the representation of the feed as items. I was trying to work out the best way to handle these items, such that I could dump the parsed data into the class, call $item['foo']->save() and have it stored in the database.

But, would instantiating a separate object for each item when I extract it from the database be worth it? I know it seems to be the trendy way to do things, so there must be something behind it, but given I’m going to extract all feeds in category X from the database in an array, does it make sense to push each row into its own object, rather than just work on the array?
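To show what I mean, here are the two options side by side. The Item class, the column names and the “read” flag are invented for the example; the rows stand in for whatever the database query would return.

```php
<?php
// Hypothetical wrapper around one database row.
class Item
{
    private $row;

    public function __construct(array $row)
    {
        $this->row = $row;
    }

    public function title()
    {
        return $this->row['title'];
    }

    public function isUnread()
    {
        return empty($this->row['read']);
    }
}

// Option 1: push each row into its own object, then ask the objects.
function unreadTitlesFromObjects(array $rows)
{
    $titles = [];
    foreach ($rows as $row) {
        $item = new Item($row);
        if ($item->isUnread()) {
            $titles[] = $item->title();
        }
    }
    return $titles;
}

// Option 2: work on the array of rows directly.
function unreadTitlesFromRows(array $rows)
{
    $titles = [];
    foreach ($rows as $row) {
        if (empty($row['read'])) {
            $titles[] = $row['title'];
        }
    }
    return $titles;
}
```

For a read-only listing the two give the same result; the object version only starts to pay off once the knowledge of what “unread” means is needed in more than one place.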

I’ve lost the thread. I’ll stop writing and come back to this when my head’s a bit clearer. Meantime, if you have any more thoughts/ideas/comments, _do_ post them :-)

#4 Rémy on 10.07.03 at 6:57 pm

Like Harry says, don’t worry about the performance issues.

Harry goes for complete OOP (I like that), but on a first look I think he went too far with it (I mean it’s too flexible for me). I would put the classes FeedConnection, Feed and Channel in one class. Mostly one connection (URL) has only one channel (but Harry has a point, you can’t count on that). I would probably call this combined class Channel, which has to be initialised with a URL.

$channel = new Channel($url);

And then there is the question of whether the Channel class should give back the items as arrays or as objects. The items only have to be shown –they have no actions like save/delete/update– so I would probably go for arrays, but object syntax looks much better to me, so I have to think about this. If they have actions, because you want to store them in a database somehow, I would definitely make objects of them. But maybe Harry has some more thoughts/insights on this.

Maybe I will give it a try this week or next and write an RSS parser myself (good practice ;D).
