I would like to grab data using a stock symbol as the primary driver. The user would enter one or more stock symbols to grab pre-defined data. For example, the user would instruct the web crawler to mine the historical price and dividend data (we will call this pre-defined data "HPD data") for the stock symbol ALSK from Yahoo's finance site.
This part of the program would not be unique - many shareware programs already provide this. But I want to build the database around the HPD data, so I think it's worth either building our own crawler or using an existing one whose licensing we are comfortable with and which we can alter for our purposes.
So step 1 would be to build a robust web crawler that would grab HPD data and dump it into a database. The HPD data fields are as follows: Date, Open, High, Low, Close and Volume. These fields are updated daily based on daily trading of the stock, so the crawler would first have to load past data (say for the past 5 years) and then update it daily thereafter. Note that dividends would be part of this HPD grab, but because they are typically paid quarterly it's not clear how the database would mark dividend payments vis-à-vis the Date field. All data needs to be properly associated with actual trading days. We can discuss this in greater detail. But for now there should be a way to match the period a dividend covers (i.e., the 3-month quarter) with the Date fields associated with the Open, High, Low, Close and Volume fields.
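To make the dividend question concrete, here is a minimal sketch of one way the database could be laid out, using Python and SQLite. Everything named here (the tables `daily_prices` and `dividends`, the helpers `open_db`, `quarter_bounds`, `store_prices`, `store_dividend` and `trading_days_for_dividend`) is hypothetical and only for illustration. It also assumes that a dividend covers the calendar quarter containing its recorded date, which is just one possible rule and would need to be confirmed. No downloading is shown; rows are assumed to come from the crawler.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one row per symbol per trading day, plus a
# separate dividend table that stores the quarter each dividend covers.
SCHEMA = """
CREATE TABLE IF NOT EXISTS daily_prices (
    symbol     TEXT NOT NULL,
    trade_date TEXT NOT NULL,   -- ISO date (YYYY-MM-DD)
    open REAL, high REAL, low REAL, close REAL,
    volume INTEGER,
    PRIMARY KEY (symbol, trade_date)
);
CREATE TABLE IF NOT EXISTS dividends (
    symbol        TEXT NOT NULL,
    dividend_date TEXT NOT NULL,
    amount        REAL NOT NULL,
    period_start  TEXT NOT NULL,
    period_end    TEXT NOT NULL,
    PRIMARY KEY (symbol, dividend_date)
);
"""

def open_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def quarter_bounds(d):
    """Return the first and last day of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    start = date(d.year, first_month, 1)
    if first_month == 10:
        end = date(d.year, 12, 31)
    else:
        end = date(d.year, first_month + 3, 1) - timedelta(days=1)
    return start, end

def store_prices(conn, symbol, rows):
    """rows: iterable of (trade_date, open, high, low, close, volume).
    INSERT OR REPLACE lets the daily update be re-run without duplicates."""
    conn.executemany(
        "INSERT OR REPLACE INTO daily_prices VALUES (?, ?, ?, ?, ?, ?, ?)",
        [(symbol, *row) for row in rows],
    )
    conn.commit()

def store_dividend(conn, symbol, dividend_date, amount):
    """Assumption: the dividend covers the quarter containing dividend_date."""
    start, end = quarter_bounds(dividend_date)
    conn.execute(
        "INSERT OR REPLACE INTO dividends VALUES (?, ?, ?, ?, ?)",
        (symbol, dividend_date.isoformat(), amount,
         start.isoformat(), end.isoformat()),
    )
    conn.commit()

def trading_days_for_dividend(conn, symbol, dividend_date):
    """List the stored trading days that fall inside the dividend's period."""
    cur = conn.execute(
        """SELECT p.trade_date
           FROM daily_prices p
           JOIN dividends d
             ON d.symbol = p.symbol
            AND p.trade_date BETWEEN d.period_start AND d.period_end
           WHERE d.symbol = ? AND d.dividend_date = ?
           ORDER BY p.trade_date""",
        (symbol, dividend_date.isoformat()),
    )
    return [row[0] for row in cur.fetchall()]
```

Keeping dividends in their own table means a price row never needs an empty dividend column on the many days when nothing is paid, and the period columns give the join a date range to match against the Date field. If the real rule for "the quarter the dividend covers" differs, only `quarter_bounds` would need to change.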
Using just one stock symbol as a test - ALSK - you would first go to the website [url removed, login to view] and then grab the HPD data from
[url removed, login to view]
I'd like to start with this relatively simple and straightforward program and then add on to it, with the web crawler grabbing other discrete sets of data fields and dumping them into the database as a supplement to the HPD data.
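One possible way to leave room for these add-ons is a small registry in which each data set is a separate, named grabber function. The sketch below is only an illustration of that idea: `GRABBERS`, `register`, `grab_hpd` and `run_all` are hypothetical names, and the HPD grabber is a placeholder that returns no rows rather than a real download.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each grabber takes a stock symbol and returns a list of row dicts.
Grabber = Callable[[str], List[dict]]
GRABBERS: Dict[str, Grabber] = {}

def register(name: str):
    """Decorator that files a grabber function under a data set name."""
    def decorator(func: Grabber) -> Grabber:
        GRABBERS[name] = func
        return func
    return decorator

@register("hpd")
def grab_hpd(symbol: str) -> List[dict]:
    # Placeholder only: a real grabber would download and parse the
    # Date, Open, High, Low, Close and Volume rows for the symbol.
    return []

def run_all(symbols: List[str],
            names: Optional[List[str]] = None) -> Dict[Tuple[str, str], List[dict]]:
    """Run the selected grabbers (all by default) for every symbol."""
    selected = names if names is not None else list(GRABBERS)
    results = {}
    for symbol in symbols:
        for name in selected:
            results[(symbol, name)] = GRABBERS[name](symbol)
    return results
```

With this layout, a later data set would be added by writing one more function decorated with `@register("some_name")`, without touching the HPD code.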