Parsing Text File (Python) (1. Locate Table Based on Keywords, 2. Extract Table Info)

$15-25 USD / hour

Completed
Posted almost 8 years ago

I would like a program that extracts a specific table from text files. Most of the files are HTML; the rest are not. To achieve that, you need to:

1) Locate the table that I want. The table I want is the "Security Ownership For certain beneficial owners". However, the name of the table can change. You will need to write the program to find ("ownership" and "security") or ("ownership" and "stock") to locate the table. The keywords ownership, security/stock/securities, and beneficial/beneficiary sometimes do not appear in the same row.

2) Extract the table data to CSV, preferably using Python. You could do it manually as well, but there will be about 2000 files.

I have attached 5 text files as well as the output file. Please see the attachment. The output for the 5 text files is also pasted below:

1st Example
[login to view URL]
none

2nd Example
[login to view URL]
none

3rd Example
Input
Name and Address | Number of Shares of Common Stock Beneficially Owned | Shares which may be Acquired within 60 Days(1) | Percent Owned(1),(2)
Genstar Capital LLC(3) | 3,534,074 | 1,335,000 | 31.5 %
Jean-Pierre L. Conte(4) | 3,473,407 | 1,311,000 | 31.1 %
Oxford BioScience Partners IV L.P.(5) | 717,293 | ? | 7.3 %
Bio-Rad Laboratories, Inc.(6) | 665,639 | ? | 6.7 %
Gabelli Asset Management Inc.(7) | 537,521 | ? | 5.4 %
Terrance J. Bieker | 160,498 | 142,498 | 1.6 %
Kevin J. Reagan | 116,832 | 111,331 | 1.2 %
John L. Zabriskie, Ph.D. | 60,500 | 45,500 | *
David J. Moffa, Ph.D.(8) | 56,350 | 48,500 | *
John R. Overturf, Jr. | 43,600 | 36,000 | *
Alan I. Edrick | 40,916 | 35,416 | *
Robert J. Weltman(9) | 27,333 | 24,000 | *
All directors and executive officers as a group (eight persons)(10) | 3,979,436 | 1,754,245 | 34.2 %

4th Example
[login to view URL]
Name and Address of Beneficial Owner | Number of Shares | Percentage of Class(1)
Larry S. Flax(2) | 2082053 | 8.1 %
Richard L. Rosenfield(3) | 2118017 | 8.3 %
Leslie E. Bider(4) | 8852 | 0 %
Marshall S. Geller(5) | 16152 | 0.1 %
Charles G. Phillips(6) | 133378 | 0.5 %
Alan I. Rothenberg(7) | 46520 | 0.2 %
Thomas P. Beck(8) | 138750 | 0.6 %
Susan M. Collyns(9) | 463203 | 1.9 %
Sarah A. Goldsmith-Grover(10) | 160211 | 0.6 %
Steven E. Rich(11) | 18794 | 0.1 %
BlackRock Inc.(12) | 1723416 | 7 %
    40 East 52nd Street New York, NY 10022
Fisher Investments(13) | 1249015 | 5.1 %
    13100 Skyline Boulevard Woodside, CA 94062-4527
The TCW Group, Inc.(14) | 2144619 | 8.7 %
    865 South Figueroa Street Los Angeles, CA 90017
Thompson, Siegel & Walmsley, LLC(15) | 1685519 | 6.9 %
    6806 Paragon Place, Suite 300 Richmond, VA 23230
All directors and executive officers as a group (10 persons)(16) | 5185930 | 18.9 %
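The keyword rule in step 1 could be sketched roughly as follows. This is only an illustration under assumptions, not a tested solution for the real files: the function names and the `window` parameter are hypothetical, and the idea of joining a few consecutive lines is one possible way to cope with keywords that do not appear in the same row.

```python
import re

# Keyword groups taken from the posting. Treating "securities" as a
# variant of "security" is an assumption about the intended matching.
OWNERSHIP = re.compile(r"\bownership\b", re.IGNORECASE)
SECURITY_OR_STOCK = re.compile(r"\b(?:security|securities|stock)\b", re.IGNORECASE)


def looks_like_ownership_heading(lines, start, window=3):
    """Return True if lines[start:start + window], joined together,
    mention "ownership" and one of security/securities/stock."""
    text = " ".join(lines[start:start + window])
    return bool(OWNERSHIP.search(text) and SECURITY_OR_STOCK.search(text))


def find_heading_index(lines, window=3):
    """Return the index of the first line whose window matches, or None."""
    for i in range(len(lines)):
        if looks_like_ownership_heading(lines, i, window):
            return i
    return None


if __name__ == "__main__":
    sample = [
        "Item 12.",
        "SECURITY",
        "OWNERSHIP OF CERTAIN BENEFICIAL OWNERS",
        "<TABLE>",
    ]
    print(find_heading_index(sample))
```

The returned index marks where a matching window begins, so the table search would start from that line onward. A larger `window` tolerates headings split over more rows but also raises the chance of false matches.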
Project ID: 11086582

About the project

35 proposals
Remote project
Active 8 years ago

Awarded to:
Hello Sir, please give me the full details of the project. If needed, I can show a sample now. I am waiting for your message. Hope we can meet here. Thanks.
$25 USD in 10 days
5.0 (130 reviews)
6.5
35 freelancers are bidding an average of $20 USD/hour for this job
Hi, I have gone through the files. I am good at data entry and Excel. I can do it via data entry. I will copy and paste the tables. Looking forward to working on this.
$20 USD in 10 days
5.0 (878 reviews)
8.0
A proposal has not yet been provided
$21 USD in 10 days
4.9 (243 reviews)
7.9
Experienced Python Expert FREELANCER HERE to work for your project. Let's discuss more and finalize the project and cost. Feel free to ask me questions, if any. I look forward to work with you. You can also contact me through Skype. Have a good day and stay fine :-) Sincere regards, Jubair
$20 USD in 10 days
4.9 (327 reviews)
7.8
Hello, I'm very interested in your project. I'm a good Python, scraping, Excel, math, and algorithm expert, and I'm quite experienced in these jobs. Let's go ahead; I want to work for you continuously. Thanks
$21 USD in 10 days
5.0 (39 reviews)
7.0
Hi, I have a team of 8 members, expert in web scraping & Excel work. I understand the requirements of your project and I can assure you of completion with the desired quality of work. I have good skills and experience in ♦ web scraping, ♦ finding contact information, ♦ phone and e-mail searching through “GOOGLE OR SOCIAL MEDIA OR GIVEN URL”. I can do this project for you quickly and successfully. I'll work for the lowest price because I want to build a reputation on freelancer.com. Please give me a chance to show my quality and help me to build a good reputation for my future jobs. I am a new freelancer but I have long experience with Microsoft Office (Word, Excel), data mining, web search, etc.
$22 USD in 10 days
5.0 (196 reviews)
7.0
all 5 star reviews for Python projects with years of experience in Python
$15 USD in 5 days
4.9 (61 reviews)
5.7
Hey there... I had a look at your examples and the corresponding output tables...I can do this in Java or C# (not Python!)... Lets agree to a fixed price instead of hourly ? $120 in 3 days...DEAL ?....Please reply.. We can discuss further and hopefully get it started soon... Thank you.. !
$15 USD in 10 days
4.8 (157 reviews)
5.8
Hi, I am a Python developer with proven and extensive experience writing Python scripts used to parse HTML markup with demonstrated quick turnaround. This is normally done as part of web scraping projects using the Beautiful Soup library (Python).
My APPROACH
I can write Python code that can:
1. Locate relevant table based on given keywords:
-- Case 1 (HTML files - 3 & 4): Use paragraphs (elements with tag <p>) to search for keywords (so that search is done on text instead of table rows). Then, locate next sibling table.
-- Case 2 (Text files - 5): Use elements with tag <PAGE> to search for keywords. Then, locate element with tag <TABLE> inside.
2. Extract Table Info:
-- Case 1 (HTML file): Regardless of table format or number of columns, there is actually a consistent structure inside each <tr> element (table row) for both column names and data rows (i.e. same number of <td> elements). Script will exclude non-breaking space (&nbsp;) characters.
-- Case 2 (Text file): Read data rows line by line.
3. Generate Excel sheet with relevant rows as output.
Deliverable is a Python script that can be run on schedule or on demand.
Hours of work: 8 Hr
Project Duration: Max. 3 days
Total Cost: 190 USD
Look forward to hearing from you.
Kind regards, Yordan B
$25 USD in 25 days
4.9 (24 reviews)
5.9
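Case 1 of the proposal above (find a paragraph containing the keywords, then read the next table) might look something like the sketch below. It deliberately swaps Beautiful Soup for the standard library's `html.parser` so it runs without extra packages, and it is not the bidder's code. The class name, the `extract_rows` helper, and the simple "all keywords as substrings" test are hypothetical; nested tables are not handled.

```python
from html.parser import HTMLParser


class NextTableAfterKeywords(HTMLParser):
    """Collect rows of the first <table> that follows a <p> containing all keywords."""

    def __init__(self, keywords):
        super().__init__(convert_charrefs=True)
        self.keywords = [k.lower() for k in keywords]
        self.p_text = None      # list of text chunks while inside a <p>
        self.armed = False      # True once a matching paragraph was seen
        self.in_table = False
        self.done = False
        self.rows, self.row, self.cell = [], None, None

    def _close_paragraph(self):
        # Paragraphs are often left unclosed, so this is also called
        # when a new <p> or a <table> starts.
        if self.p_text is not None:
            text = " ".join(self.p_text).lower()
            if all(k in text for k in self.keywords):
                self.armed = True
            self.p_text = None

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "p" and not self.in_table:
            self._close_paragraph()
            self.p_text = []
        elif tag == "table" and not self.in_table:
            self._close_paragraph()
            self.in_table = self.armed
        elif self.in_table and tag == "tr":
            self.row = []
        elif self.in_table and tag in ("td", "th") and self.row is not None:
            self.cell = []

    def handle_endtag(self, tag):
        if self.done:
            return
        if tag == "p":
            self._close_paragraph()
        elif tag in ("td", "th") and self.cell is not None:
            # Drop non-breaking spaces and collapse runs of whitespace.
            text = "".join(self.cell).replace("\xa0", " ")
            self.row.append(" ".join(text.split()))
            self.cell = None
        elif tag == "tr" and self.row is not None:
            if any(self.row):
                self.rows.append(self.row)
            self.row, self.cell = None, None
        elif tag == "table" and self.in_table:
            self.in_table, self.done = False, True

    def handle_data(self, data):
        if self.p_text is not None:
            self.p_text.append(data)
        if self.cell is not None:
            self.cell.append(data)


def extract_rows(html, keywords=("ownership", "security")):
    parser = NextTableAfterKeywords(keywords)
    parser.feed(html)
    parser.close()
    return parser.rows
```

Because matching is by substring, the keyword "security" would not match a heading that only says "securities"; a real script would need the fuller keyword rule from the project description.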
**Fast & Efficient Delivery** Greetings! Hi, I'm a computer science graduate with more than 2 years of experience in application development. I've read all the details and the files that you attached here (input & output). I will do this task by first extracting data from the files, searching each entire HTML file for the required table, parsing each account entry, and removing duplicates if found; after that I will write the data to an XLS file. I will do this task in C#, and the application interface can be desktop or console. Note: I've already worked on document file parsing, so it will be an easy task for me. My work will speak for itself. Looking forward to your consideration.
$15 USD in 15 days
4.9 (34 reviews)
5.0
My name is Mike and I’m from UK. I work with individual clients and also provide outsourcing services for a number of UK and USA based agencies. Your project description sounds interesting to me and I do have skills & experience that are required to complete this project. I can show you some examples of my work. Please contact me to discuss your project.
$22 USD in 10 days
5.0 (1 review)
3.2
This looks like a fun project. All tables seem to be within some html code. The files do contain some extra text. The extra text seems to not be relevant. The plan would be:
1. Strip extra text using regex
2. Convert html to tree using lxml
3. Use xpath to locate tables
4. Extract table information using xpath
5. Use csv module to write to csv file (one for each processed file)
6. Merge all files into one (if necessary)
7. Convert final file to xlsx (if necessary)
Milestone 1: Result for first 100 files
Milestone 2: Result for all files
All files to be provided.
Best, Tammo
$27 USD in 10 days
5.0 (3 reviews)
2.8
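Steps 5 and 6 of the plan above (one CSV per processed file, then a merge) could be sketched with the standard `csv` module as below. The lxml/XPath steps are not shown. The function names are hypothetical, and prefixing each merged row with its source file name is an added assumption so that rows stay traceable, not something the bidder specified.

```python
import csv
from pathlib import Path


def write_rows(rows, out_path):
    """Write one extracted table (a list of row lists) to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def merge_csvs(csv_paths, merged_path):
    """Concatenate per-file CSVs, prefixing each row with its source file name."""
    with open(merged_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in csv_paths:
            with open(path, newline="", encoding="utf-8") as f:
                for row in csv.reader(f):
                    writer.writerow([Path(path).name, *row])
```

The `csv` writer quotes values such as `3,534,074` automatically, so share counts with thousands separators survive the round trip as single fields.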
Hello, I have read your description very carefully. I am very good at parsing and python scripting. I can deliver the result as per your requirement. Price for whole project (both task included) : 200 USD Lets discuss more over chat. Looking forward to work with you. -Viral Parekh
$16 USD in 10 days
5.0 (2 reviews)
2.8
I can get this done quickly using python Pandas tools. You can count on me to deliver quickly and efficiently.
$22 USD in 10 days
5.0 (1 review)
2.2
Happy to help
$15 USD in 20 days
5.0 (2 reviews)
2.2
You pay me after checking the work. Hi I have read out all the details given in your project and I am fully capable to deliver you this project with 100% accuracy. I have completed many projects related to this in the past. Why you do not knock me here for further detail? You can release the milestone after checking the work.
$16 USD in 10 days
5.0 (4 reviews)
2.2
Hello, I'm a recent graduate about to begin a program working in data science. For the past year I have been working extensively in Python, performing a lot of research analysis. This required me to learn to parse text files and extract the information I need both quickly and cleanly. Using these skills, combined with some regex and my familiarity with HTML, I could easily do this job. I look forward to hearing from you, Charlie
$15 USD in 5 days
5.0 (1 review)
0.4
I have almost 5 years of experience writing Engineering tools in python. During this time I had to parse many files, so I am well acquainted with your problem. Thanks
$16 USD in 10 days
0.0 (0 reviews)
0.0
i code in python regularly, python is such a great choice for this kind of text processing task, would like to give it a try.
$22 USD in 10 days
0.0 (0 reviews)
0.0
Please see the summary about myself. I am very easy to work with and detail-oriented. I don't need a lot of guidance, mostly just an outline of what needs to be done. At my last job, which I quit after getting a new job, all I did was write scripts in Python. I did similar things to what this project needs. For my hourly wage, I just put down what I was getting paid previously for doing this kind of work, and I am more experienced now than before.
$21 USD in 10 days
0.0 (0 reviews)
0.0
I have been working as a Quantitative Researcher in the finance industry for 7 years and have done lots of projects like this. For example, I worked on an ETF strategy before and had to scrape ETF websites, i.e. parse the HTML file, locate the data table, extract the data, and finally store the data in our database. I'm a Python expert and very skillful with libs like pandas, requests, beautifulsoup and so on. I will deliver in a very effective and efficient fashion.
$22 USD in 10 days
0.0 (0 reviews)
0.0

About the client

Tempe, United States
5.0
8
Payment method verified
Member since Sep 28, 2012
