Trying to understand the eBay schema
I want to build an eBay-like site (a mini version, probably on the LAMP stack as a base setup, unless you would suggest something else), and I'm wondering how they built their system. Most of all, I don't understand how they manage their categories. They presumably have one piece of code for search, one for listing items for sale, and one for displaying items. But how do they create and store a template for each category? What is the structure of the database behind their setup? And finally, they have so many categories and subcategories. Say someone lists an item in the following category (this is most likely the process eBay uses to add categories): Engines → Parts and Accessories > Racing Parts
A few days later, people are requesting additional subcategories under Racing Parts:
- Accessories
- Auto racing
- Fasteners, fluids and gaskets
- Kart racing parts
- Safety engineering
- Other
So now they have a new level under Racing Parts, which looks like this:
- Engines → Parts and Accessories > Racing Parts > Accessories
- Engines → Parts and Accessories > Racing Parts > Fasteners, etc.
What happens to existing listings that were published before the new subcategories were added? Do they get moved into a subcategory? Does eBay require new items to be listed in the subcategories and remove the old submission form for "Racing Parts"? If they do, a user who can't find a fitting category might get confused and not post, and then eBay loses money. And if they don't remove the generic Racing Parts posting form, users will post to a category that is too generic, and it becomes difficult to use the Refine Search option, because each form has different fields that eBay can filter on.
If you have any ideas please let me know. I am really confused about how they do it and really would like to understand :)
Here's Randy Shoup on eBay architecture
He mainly talks about scalability, availability, manageability, etc. The schema is something that you have to design yourself based on your specific requirements. Slides
From the transcript of his talk:
"Is it even a relational database or is it really different?"
"It is completely different. It is a search engine like Google or Yahoo!, developed by some of the same people who developed the AltaVista search engine, and like many search engines it is built along the same lines, which is an inverted index. There is a set of documents with identifiers, keywords are indexed into those documents, and query operations are performed by intersecting the lists or vectors for those keywords. Very simple, although there are many more details about how it works. As an aside, the challenge for an eBay-style search engine is that our users expect the search engine to be updated in near real time. When someone places a bid on an item, that changes the price, and if price is a filter, the query cares about that change. So what it really means is that the classic web search engine style, where you build the index as a batch and then load it into the search engine, doesn't really work for us. It has to be much closer to real time. I'll go into a little bit about how this real-time system works in my asynchrony section, but anyway, to finish the thought about scalability for search, the idea is that the search engine can be partitioned horizontally. So there is this overall search index of whatever size. We divide it into ten or twenty or sixty or a hundred pieces, and we divide the infrastructure the same way. And then we have an aggregator that does scatter/gather over all these different parts of the index. So someone asks for "iPod" or "Mickey Mouse" or "Wii", and the aggregator sends the query to each of the different sections or shards, gets the results back, aggregates them, and sends them to the user."
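To make the quoted description a bit more concrete, here is a toy sketch in Python of an inverted index split across shards with a scatter/gather aggregator. It is only an illustration of the idea, not eBay's implementation: the `Shard` and `Aggregator` classes, the modulo sharding rule, and the sample listings are all made up for this example, and a real system would query the shards in parallel over the network.

```python
from collections import defaultdict


class Shard:
    """One horizontal slice of the index: keyword -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def index(self, doc_id, text):
        # Re-indexing a single document stands in for near-real-time updates
        # (as opposed to rebuilding the whole index as a batch).
        self.remove(doc_id)
        self.docs[doc_id] = text
        for word in text.lower().split():
            self.postings[word].add(doc_id)

    def remove(self, doc_id):
        old_text = self.docs.pop(doc_id, None)
        if old_text is None:
            return
        for word in old_text.lower().split():
            self.postings[word].discard(doc_id)

    def search(self, query):
        # A query is answered by intersecting the posting lists of its keywords.
        result = None
        for word in query.lower().split():
            ids = self.postings.get(word, set())
            result = set(ids) if result is None else result & ids
        return result or set()


class Aggregator:
    """Sends each query to every shard and merges the partial results."""

    def __init__(self, shard_count):
        self.shards = [Shard() for _ in range(shard_count)]

    def index(self, doc_id, text):
        # Hypothetical sharding rule: pick the shard by document id.
        self.shards[doc_id % len(self.shards)].index(doc_id, text)

    def search(self, query):
        hits = set()
        for shard in self.shards:        # scatter (sequential here for simplicity)
            hits |= shard.search(query)  # gather
        return sorted(hits)


agg = Aggregator(shard_count=3)
agg.index(1, "Apple iPod nano 8GB")
agg.index(2, "Mickey Mouse plush toy")
agg.index(3, "Nintendo Wii console")
agg.index(4, "iPod touch case")

ipod_hits = agg.search("ipod")            # documents 1 and 4, from one or more shards
ipod_nano_hits = agg.search("ipod nano")  # intersection of two keyword lists

agg.index(4, "phone case")                # listing 4 changes; only its shard is touched
ipod_hits_after_update = agg.search("ipod")
```

The point of the update at the end is that a change to one listing only touches that listing's entries in one shard, which is what makes near-real-time updates feasible in this style of design.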
A simple design I can think of is to have one database relation (table) for storing categories and another for the items that belong to them.
The first relation stores the categories in a parent-child model, which makes it quite easy to add more categories or subcategories. The second relation, for items, has a foreign-key relationship to one or more rows in the first relation, reflecting the one or more categories to which the item belongs.
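As a rough sketch of that design, here is a small Python example using the standard-library `sqlite3` module (the same SQL idea carries over to MySQL on a LAMP stack). The table and column names (`categories`, `items`, `item_categories`) and the sample listings are assumptions made up for illustration. Since an item may belong to more than one category, the foreign keys live in a separate link table here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES categories(id),  -- NULL for a top-level category
    name      TEXT NOT NULL
);
CREATE TABLE items (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
-- Link table: one row per (item, category) pair.
CREATE TABLE item_categories (
    item_id     INTEGER NOT NULL REFERENCES items(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (item_id, category_id)
);
""")


def add_category(name, parent_id=None):
    cur = conn.execute(
        "INSERT INTO categories (parent_id, name) VALUES (?, ?)",
        (parent_id, name),
    )
    return cur.lastrowid


def add_item(title, category_ids):
    cur = conn.execute("INSERT INTO items (title) VALUES (?)", (title,))
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO item_categories (item_id, category_id) VALUES (?, ?)",
        [(item_id, category_id) for category_id in category_ids],
    )
    return item_id


def items_under(category_id):
    """Titles of items in the category or in any of its descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM categories WHERE id = ?
            UNION ALL
            SELECT c.id FROM categories c JOIN subtree s ON c.parent_id = s.id
        )
        SELECT DISTINCT i.title
        FROM items i
        JOIN item_categories ic ON ic.item_id = i.id
        JOIN subtree s ON s.id = ic.category_id
        ORDER BY i.title
        """,
        (category_id,),
    ).fetchall()
    return [row[0] for row in rows]


engines = add_category("Engines")
parts = add_category("Parts and Accessories", engines)
racing = add_category("Racing Parts", parts)
add_item("Racing camshaft", [racing])      # listed before any subcategory exists

# A few days later a subcategory is added; the existing listing is not touched.
karts = add_category("Kart racing parts", racing)
add_item("Kart sprocket", [karts])

racing_items = items_under(racing)
kart_items = items_under(karts)
```

With this layout, adding a subcategory is a single insert into `categories`. Older listings simply stay attached to the parent category and still show up when someone browses it, because the query walks the whole subtree. Whether to migrate them into the new subcategories is then a separate decision.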