Android application for pictures/videos voice tagging

Technion - Israel Institute of Technology, Computer Science Department, Industrial Project (234313). Android application for pictures/videos voice tagging. Students: Yevgeni Sabin, Vladimir Rudenko. Supervisors: Nadav Golbandi, Oren Somekh.


Presentation Transcript


  1. Technion - Israel Institute of Technology COMPUTER SCIENCE DEPARTMENT Industrial Project (234313) Android application for pictures/videos voice tagging Students: Yevgeni Sabin, Vladimir Rudenko Supervisors: Nadav Golbandi, Oren Somekh

  2. Motivation • Picture and video sharing over the Internet is very popular today. • Users want to tag their pictures for classification and retrieval purposes. • Many of those pictures are taken with mobile devices such as smartphones. • Today, in order to tag a picture, the user has to type the name/tag on the phone's keyboard. • The goal of our project is to simplify the process of taking a picture, tagging it and uploading it to the Internet by making it a “one-click operation”.

  3. Objectives • Make an Android smartphone able to record voice tags and add them to a picture. • The voice is added to the JPEG file seamlessly, so that the file can still be handled by standard JPEG tools (e.g., galleries). • Make an Android smartphone able to manage voice tags by adding, editing or deleting them using a picture browser.

  4. Objectives • Make an Android smartphone able to upload its voice-tagged pictures to an external web server. • Currently we use Flickr as the picture hosting server, accessed through the Flickr API, which lets the user work with an existing and popular web service. • Ensure a secure connection to the web service. • After uploading the voice-tagged picture, the application will be able to receive feedback from the server that includes the extracted text tags.

  5. Methodology • To achieve these objectives, two standalone applications were developed: • TuCo Camera – a camera application that allows voice tagging and uploading of pictures in addition to the standard operations. • TuCo Gallery – a gallery application that allows voice tagging and uploading of pictures in addition to the standard operations. • Both applications were developed from scratch. • Developing them separately lets the user use just one of the applications together with a third-party application (e.g., TuCo Gallery + the standard camera).

  6. Methodology

  7. Image and audio encapsulation • The voice tagging application allows the user to record up to 15 seconds of voice and inserts the voice data directly into the JPEG file without affecting the image data. • The audio file is split into chunks of up to 64 KB. Each chunk is pushed into one “Application block” (a JPEG APPn application segment). We use APP3 to APP13 (they are available according to the JPEG specification). • Audio is stored in PCM 16 kHz/16-bit format.
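
The chunking scheme above can be sketched roughly as follows. This is a minimal illustration, not the TuCo implementation: the function names are hypothetical, the choice to leave any leading APP0 to APP2 segments (e.g. JFIF, Exif) in front of the audio is an assumption, and the sketch assumes the picture does not already use APP3 to APP13 for something else. The per-chunk limit of 65533 bytes follows from the two-byte segment length field, which counts itself.

```python
import struct

SOI = b"\xff\xd8"                     # JPEG start-of-image marker
SOS = 0xDA                            # start of scan: compressed image data follows
MAX_PAYLOAD = 65535 - 2               # the 2-byte length field counts itself
AUDIO_MARKERS = range(0xE3, 0xEE)     # APP3 (0xFFE3) through APP13 (0xFFED)
LEADING_MARKERS = (0xE0, 0xE1, 0xE2)  # APP0..APP2 stay in front (assumption)


def build_audio_segments(audio: bytes) -> bytes:
    """Split audio into chunks and wrap each one in its own APPn segment."""
    chunks = [audio[i:i + MAX_PAYLOAD] for i in range(0, len(audio), MAX_PAYLOAD)]
    if len(chunks) > len(AUDIO_MARKERS):
        raise ValueError("audio does not fit into APP3..APP13")
    out = bytearray()
    for marker, chunk in zip(AUDIO_MARKERS, chunks):
        # Segment = 0xFF, marker byte, big-endian length (payload + 2), payload.
        out += bytes([0xFF, marker]) + struct.pack(">H", len(chunk) + 2) + chunk
    return bytes(out)


def embed_audio(jpeg: bytes, audio: bytes) -> bytes:
    """Insert the audio segments after SOI and any leading APP0..APP2 segments."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    pos = 2
    while (pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF
           and jpeg[pos + 1] in LEADING_MARKERS):
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length
    return jpeg[:pos] + build_audio_segments(audio) + jpeg[pos:]


def extract_audio(jpeg: bytes) -> bytes:
    """Walk the header segments and concatenate the APP3..APP13 payloads."""
    pos, audio = 2, bytearray()
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF and jpeg[pos + 1] != SOS:
        marker = jpeg[pos + 1]
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker in AUDIO_MARKERS:
            audio += jpeg[pos + 4:pos + 2 + length]
        pos += 2 + length
    return bytes(audio)
```

Because decoders skip application segments they do not recognise, a file produced this way should still open in ordinary JPEG viewers. As a rough size check, and assuming mono recording (the slides do not state the channel count), 15 seconds of 16 kHz/16-bit PCM is 15 × 16000 × 2 = 480,000 bytes, which would occupy 8 of the 11 available segments.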

  8. Image and audio encapsulation • Voice data layout • Header (128 bytes) – includes various information such as the voice block size, upload status and text tags. • WAV header (44 bytes) – includes the voice parameters in WAV format. • PCM raw data (up to ~600 KB) – the raw voice data.
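
The layout above could be assembled along these lines. The 44-byte WAV header follows the standard canonical PCM layout, but the field order inside the 128-byte application header is purely hypothetical (the slides only name its contents), as are the function names and the mono channel count.

```python
import struct

APP_HEADER_SIZE = 128   # from the slide
WAV_HEADER_SIZE = 44    # canonical PCM WAV header
MAX_TAG_BYTES = APP_HEADER_SIZE - 5  # hypothetical: 4-byte size + 1-byte status


def wav_header(data_size: int, sample_rate: int = 16000,
               bits: int = 16, channels: int = 1) -> bytes:
    """Build a canonical 44-byte RIFF/WAVE header for uncompressed PCM."""
    block_align = channels * bits // 8
    byte_rate = sample_rate * block_align
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",
        b"fmt ", 16, 1, channels,          # 16-byte fmt chunk, format 1 = PCM
        sample_rate, byte_rate, block_align, bits,
        b"data", data_size,
    )


def pack_voice_block(pcm: bytes, uploaded: bool = False, tags: str = "") -> bytes:
    """Concatenate app header, WAV header and raw PCM into one voice block.

    Hypothetical app header layout: uint32 block size, uint8 upload status,
    then UTF-8 text tags padded with NUL bytes up to 128 bytes in total.
    """
    tag_bytes = tags.encode("utf-8")
    if len(tag_bytes) > MAX_TAG_BYTES:
        raise ValueError("text tags do not fit into the header")
    block_size = APP_HEADER_SIZE + WAV_HEADER_SIZE + len(pcm)
    header = struct.pack("<IB123s", block_size, int(uploaded), tag_bytes)
    return header + wav_header(len(pcm)) + pcm


# One second of silence at 16 kHz/16-bit mono, tagged "beach".
block = pack_voice_block(b"\x00\x00" * 16000, tags="beach")
```

One apparent benefit of storing a complete WAV header in front of the samples is that everything after the first 128 bytes can be handed to a standard audio player without further conversion.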

  9. System architecture (diagram; component labels: insert/extract voice from picture; upload picture to server; play/record audio; show all pictures in gallery; show single picture full screen (appears twice); show camera view)

  10. Future development • Add voice encoding to decrease the size of the voice data • Concurrent uploading of multiple pictures • Integration with other photo web services (such as Picasa and Panoramio) • GUI and UI improvements • Porting to other mobile platforms (such as iPhone and Windows Mobile)
