TARSU-Persone Fisiche (*)
Italian fiscal data entry system for waste management taxation
francesco.senia@loquendo.com
(*) English: TARSU-Physical People. TARSU = Tassa Raccolta Rifiuti Solidi Urbani (urban solid waste collection tax)
Background
• The current tax office inherited its taxpayer databases from other offices
• These databases contain errors or are missing information
• The Municipality of Rome wants to clean them up
• Staggered mailings are sent to targeted sections of the taxpayers
• For business premises, the tax office already uses an existing service developed by Loquendo
• That service is DTMF-based, because its access key is the VAT registration number (digits only)
Fiscal data entry system
• Loquendo has developed a new voice-operated service for private citizens whose business is not recorded in the public VAT registry (access key: fiscal code, a 16-character alphanumeric code)
• It uses Loquendo voice technology (TTS and speech recognition)
• Customers, identified by their own ID (sent to them by the Rome office), enter their “fiscal card data” and their “business VAT code”.
Speech Form Used (8 voice steps), with example values
• customer ID: 9989020
• fiscal card data
  • surname: ROSSI
  • forename: MARIO
  • city & prov. of birth: COLOGNO MONZESE (MI)
  • date of birth: 10 MAY 1963
  • sex: M
  • fiscal code: RSSMRA63E10C895A
• business VAT code: 7310G
Fiscal Card Data
Fields shown on the fiscal card: Fiscal Code, Surname, Forename, Sex, Birth Place, Province of Birth (code), Birth Date
Example of Fiscal Code
Rossi Mario, born on the 10th of May 1963 in Cologno Monzese: RSS MRA 63 E 10 C895 A
a) RSS  b) MRA  c) 63  d) E  e) 10  f) C895  g) A
Rules are in the next slide
Fiscal Code Rules
a) The first three characters are the first three consonants of the surname. If the surname contains fewer than three consonants, the remainder is filled with its first vowels.
b) As for a), using the first name.
c) The next two characters are the last two digits of the year of birth (63 for 1963).
d) This is the code for the month of birth, in this case E for May.
e) This is the day of the month on which the individual was born. For males, this figure ranges from 1 to 31. For females, 40 is added to the day of the month, so the range is 41 to 71. In this way gender can be distinguished.
f) This is the code for the city of birth, in this case C895 for Cologno Monzese. The city code list is maintained by the Finance Ministry.
g) This is a checksum character.
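As an illustration, rules a) to f) can be sketched in a few lines of Python. This is a minimal sketch, not the service's code: the function names are hypothetical, the month letter table is the standard one (the slides only give E for May), the city code is passed in rather than looked up, and short names are padded with X. Rule b) is applied exactly as stated on the slide; the official rule treats first names with four or more consonants differently.

```python
# Month letters for January..December (standard table, assumed; the slide only shows E = May).
MONTH_CODES = "ABCDEHLMPRST"
VOWELS = "AEIOU"


def three_letters(word):
    """Rules a) and b): first three consonants, then vowels if needed, then X padding."""
    letters = [c for c in word.upper() if c.isalpha()]
    consonants = [c for c in letters if c not in VOWELS]
    vowels = [c for c in letters if c in VOWELS]
    return "".join((consonants + vowels + ["X", "X", "X"])[:3])


def fiscal_code_prefix(surname, forename, day, month, year, is_male, city_code):
    """Build the first 15 characters (everything except the checksum, rule g)."""
    day_field = day if is_male else day + 40  # rule e): females get day + 40
    return (
        three_letters(surname)
        + three_letters(forename)
        + f"{year % 100:02d}"       # rule c)
        + MONTH_CODES[month - 1]    # rule d)
        + f"{day_field:02d}"        # rule e)
        + city_code                 # rule f), e.g. C895 for Cologno Monzese
    )


print(fiscal_code_prefix("Rossi", "Mario", 10, 5, 1963, True, "C895"))
```

For the example on the previous slide this yields RSSMRA63E10C895, to which the checksum character A is appended.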
Business VAT code
• It is an alphanumeric code, designed by the Finance Ministry to uniquely identify every type of business (e.g. pizzeria, university, hospital, shoemaker, ...).
• Currently more than 1232 codes have been defined.
• They vary in length and composition:
  • 4 or 5 digits, or
  • 4 digits + 1 letter
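Business VAT codes examples
4067   PRODOTTI NON ALIMENTARI, NON ALTROVE CLASSIFICABILI (non-food products not classifiable elsewhere)
9262C  ALTRE ATTIVITA' PROFESSIONALI SPORTIVE INDIPENDENTI (other independent professional sports activities)
26620  FABBRICAZIONE DI PRODOTTI IN GESSO PER L'EDILIZIA (manufacture of plaster products for construction)
31621  FABBRICAZIONE DI ALTRI APPARECCHI ELETTRICI N.C.A. (manufacture of other electrical equipment n.e.c.)
52626  COMM. AMBULANTE A POSTEGGIO FISSO ART. DI OCCASIONE (itinerant trade at a fixed pitch, second-hand goods)
7310G  RICERCA E SVIL. SPERIM. NELLE SCIENZE NATURALI E INGEGNERIA (research and experimental development in natural sciences and engineering)
9262A  ATTIVITA' PROFESSIONALI SPORTIVE SVOLTE DA ATLETI (professional sports activities carried out by athletes)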
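The stated composition (4 or 5 digits, or 4 digits plus 1 letter) can be checked with a short pattern. This is only a format check under that description; the function name is hypothetical and the sketch does not verify that a code is actually one of the defined ones.

```python
import re

# 4 or 5 digits, or 4 digits followed by one letter (format only, no lookup).
ACTIVITY_CODE = re.compile(r"[0-9]{4,5}|[0-9]{4}[A-Z]")


def is_well_formed(code):
    """Return True if the code has one of the shapes described on the slide."""
    return ACTIVITY_CODE.fullmatch(code.strip().upper()) is not None


for example in ["4067", "9262C", "26620", "7310G", "731G"]:
    print(example, is_well_formed(example))
```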
The service structure
START SERVICE
• GET ID CODE (voice or DTMF) [CONN. DIGIT GRAMMAR]
• ACCESS DB to GET USER IDENTITY
• PROMPT USER IDENTITY
• GET IDENTITY CONFIRMATION [YES/NO GRAMMAR]
• GET ITALIAN NATIONAL. CONF [YES/NO GRAMMAR]
• GET GENDER [YES/NO GRAMMAR]
• GET BIRTHDAY [DATE GRAMMAR]
• GET PLACE OF BIRTH [ITALY DATABASE]
• GET FISCAL CODE [FISCAL CODE GRAMMAR]
• GET BUSINESS VAT CODE [BUS. VAT CODE GRAMMAR]
• RECORD VOCAL SIGNATURE
END SERVICE
Customer identification
Steps: GET ID CODE (voice or DTMF) [CONN. DIGIT GRAMMAR], ACCESS DB to GET USER IDENTITY, PROMPT USER IDENTITY, GET IDENTITY CONFIRMATION [YES/NO GRAMMAR]
• TARSU-PF works only for registered users. They receive their unique identification number by post from the Rome Municipality.
• After the caller enters the 7-digit ID code, the system replies with the customer's “name + surname” as recorded in the database.
• If the customer does not confirm it, the call is transferred to an operator.
Language and Sex identification
Steps: GET ITALIAN NATIONAL. CONF, GET GENDER [YES/NO GRAMMAR]
• The TARSU-PF speech recogniser is designed only for Italians. To screen out foreign callers, the system asks about the user's nationality: “Were you born in Italy?”
• In the next step the system asks for the customer's gender with another Yes/No question: “Are you a male?”
Birthday recognition (main task)
Step: GET BIRTHDAY [DATE GRAMMAR]
• The system asks the customer for their birthday, which can be uttered as a whole sentence in different styles:
  • 10 Maggio 1963
  • 10 5 63
  • 10 05 1963
  • 10 05 63 (this is the fiscal card style)
• After speech recognition, the system asks the customer to confirm the birthday
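To make the four styles concrete, the sketch below normalises the recognised text of a birthday into a single date. It works on text, not audio, and is not the Loquendo date grammar: the function name is hypothetical, and mapping two-digit years to 19xx is an assumption that suits a trial run in 2000.

```python
from datetime import date

ITALIAN_MONTHS = [
    "gennaio", "febbraio", "marzo", "aprile", "maggio", "giugno",
    "luglio", "agosto", "settembre", "ottobre", "novembre", "dicembre",
]


def parse_birthday(text, century=1900):
    """Parse 'day month year' where month is a name or a number; None if invalid."""
    parts = text.lower().split()
    if len(parts) != 3:
        return None
    day_s, month_s, year_s = parts
    try:
        day = int(day_s)
        if month_s in ITALIAN_MONTHS:
            month = ITALIAN_MONTHS.index(month_s) + 1
        else:
            month = int(month_s)
        year = int(year_s)
        if len(year_s) <= 2:
            year += century  # assumption: two-digit years mean 19xx
        return date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return None


for spoken in ["10 Maggio 1963", "10 5 63", "10 05 1963", "10 05 63"]:
    print(spoken, "->", parse_birthday(spoken))
```

All four styles map to the same date, 10 May 1963, which is what the confirmation prompt would then read back.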
Birthday recognition (redo task)
Steps: GET DAY OF THE MONTH, GET MONTH, GET YEAR [DATE GRAMMAR]
• In case of error, the system switches to a block mode, asking for day, month and year as separate entries. The previous grammar is re-used, and each piece of information can be entered in the same range of styles.
• Each piece of information is explicitly confirmed
Place of birth collection (DB-driven speech recognition)
Step: GET PLACE OF BIRTH [ITALY DATABASE]
• This task uses a DB-driven approach to speech recognition
• The relational database contains about 20,000 records to correctly identify the 13,600 city names defined since 1861
• It uses 2 vocabularies derived from the relational database: city names (+ aliases) and district + region (+ aliases)
Three steps:
1. city acquisition
2. (optional) if more than one record is selected, the system asks for the additional district or region name
3. if more than one record still remains, the system enters a confirmation loop and browses the selected list, and the user then selects the right record.
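The three steps above amount to a progressive narrowing of database records. The following is a minimal sketch of that idea only: the rows, the alias and the function names are illustrative and are not taken from the real database or from the service's code.

```python
from typing import NamedTuple


class Place(NamedTuple):
    city: str
    district: str
    region: str


# Illustrative rows only, not an extract of the real 20,000-record database.
PLACES = [
    Place("Cologno Monzese", "Milano", "Lombardia"),
    Place("Castro", "Bergamo", "Lombardia"),
    Place("Castro", "Lecce", "Puglia"),
]
CITY_ALIASES = {"cologno": "cologno monzese"}  # hypothetical alias


def find_by_city(spoken_city, places=PLACES):
    """Step 1: select every record whose city name (or alias) matches."""
    name = spoken_city.lower()
    name = CITY_ALIASES.get(name, name)
    return [p for p in places if p.city.lower() == name]


def narrow(candidates, spoken_area):
    """Step 2: keep the candidates in the given district or region."""
    area = spoken_area.lower()
    return [p for p in candidates if area in (p.district.lower(), p.region.lower())]


candidates = find_by_city("Castro")
if len(candidates) > 1:
    candidates = narrow(candidates, "Puglia")
print(candidates)  # step 3 would browse this list if it still had several entries
```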
ER model of the Italian administrative database
[Diagram] Entities: COMUNI, NOMI COMUNI, ALIAS_NOMI_COMUNI, PROVINCE, ALIAS_PROVINCE, REGIONI, ALIAS_REGIONI.
Attributes appearing in the diagram: COM_ID, NOME_COM_ID, NOME_COMUNE, ALIAS_COM_ID, ALIAS_COM, FLAG_CAPOLUOGO, NAZIONALE, PROV_ID, NOME_PROVINCIA, ALIAS_PROV_ID, ALIAS_PROV, REG_ID, NOME_REGIONE, ALIAS_REG_ID, ALIAS_REG.
Relationships are annotated with 1:1, 1:n and 0:n cardinalities.
Fiscal code recognition
Step: GET FISCAL CODE [FISCAL CODE GRAMMAR]
Example: Rossi Mario, born on the 10th of May 1963 in Cologno Monzese: RSS (surname) MRA (name) 63 (year) E (month) 10 (day) C895 (city code) A (checksum)
• The recogniser accepts numbers spoken in groups (e.g. seventy-three) as well as character by character.
• The recognition task uses the checksum to pick the first string in the hypothesis list that has a valid checksum, and prompts the user with it for confirmation
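The checksum filtering can be illustrated as follows. The slides do not give the checksum rule, so this sketch uses the standard fiscal code control-character algorithm (a weight table for odd positions, plain values for even positions, sum modulo 26); the function names and the hypothesis list are hypothetical, and the real recogniser's n-best handling may differ.

```python
# Values for characters in odd positions (1st, 3rd, ...); index 0-25 stands for
# A-Z, and digits 0-9 share the values of A-J. Standard table, not from the slides.
ODD_VALUES = [1, 0, 5, 7, 9, 13, 15, 17, 19, 21, 2, 4, 18,
              20, 11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23]


def check_char(first15):
    """Compute the 16th (checksum) character from the first 15."""
    total = 0
    for i, ch in enumerate(first15):
        value = int(ch) if ch.isdigit() else ord(ch) - ord("A")
        total += ODD_VALUES[value] if i % 2 == 0 else value
    return chr(ord("A") + total % 26)


def is_valid(code):
    code = code.upper()
    if len(code) != 16 or not (code.isascii() and code.isalnum()):
        return False
    return check_char(code[:15]) == code[15]


def first_valid(hypotheses):
    """Return the best-ranked hypothesis whose checksum is consistent, or None."""
    return next((h for h in hypotheses if is_valid(h)), None)


# Hypothetical recogniser output, best hypothesis first.
n_best = ["RSSNRA63E10C895A", "RSSMRA63E10C895E", "RSSMRA63E10C895A"]
print(first_valid(n_best))
```

Here the two top-ranked hypotheses fail the checksum, so the third, RSSMRA63E10C895A, is the one offered to the caller for confirmation.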
Business VAT code recognition
Step: GET BUSINESS VAT CODE [BUS. VAT CODE GRAMMAR]
• The business VAT code is collected with a procedure of up to four steps:
1) it gets the code length (4 or 5 characters)
2) for 5 characters, the system asks whether the last character is a digit or a letter
3) it gets the numerical part
4) it gets the last character, if it is a letter.
• This approach is long because it is the same one already used in the existing system, which also accepts DTMF. In this case we preferred not to alter the original structure.
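The four steps can be read as a small dialogue loop. The sketch below assumes a hypothetical ask(question) callback that returns the recognised answer as text; it shows the order of the questions only and is not the service's dialogue code.

```python
def collect_activity_code(ask):
    """Collect a business VAT code step by step; 'ask' returns recognised text."""
    length = int(ask("length"))                      # step 1: 4 or 5 characters
    if length not in (4, 5):
        raise ValueError("length must be 4 or 5")
    # step 2: only asked for 5-character codes
    ends_with_letter = length == 5 and ask("last_is_letter") == "yes"
    n_digits = length - 1 if ends_with_letter else length
    digits = ask("digits")                           # step 3: numerical part
    if not (digits.isdigit() and len(digits) == n_digits):
        raise ValueError("unexpected numerical part")
    letter = ask("letter").upper() if ends_with_letter else ""  # step 4
    return digits + letter


# Scripted answers standing in for the caller (hypothetical).
answers = iter(["5", "yes", "7310", "g"])
print(collect_activity_code(lambda question: next(answers)))
```

With the scripted answers the result is 7310G; a 4-digit code such as 4067 needs only steps 1 and 3.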
Vocal signature
Step: RECORD VOCAL SIGNATURE, then END SERVICE
• The last step records a “vocal signature” from the caller.
• This is used to certify the collected information.
• Users are asked to record at least their telephone number, so that municipality personnel can reach them in case of errors.
• This is a “pure” recording task: no speech recognition checks are performed
Trial Design
• 100 subjects recruited by a marketing firm
• Subjects rewarded with a 30,000 lire voucher
• Experiment run from 5 to 13 December 2000
• Priming letter prepared by CSELT/CRoma
• Background questionnaire prepared by CSELT
• Likert usability questionnaire prepared by CCIR/CSELT
Subjects data/tasks
• Each participant received/performed
  • One priming letter
  • One background questionnaire
  • Two telephone calls
  • Two different tasks: fictitious and own details
  • Two Likert usability questionnaires
Dimensions of the subjects space
• 100 participants
• 4-dimensional space
  • 2 genders
  • 3 age groups
  • 2 ID code input modes
  • 2 first call orders
• 2 x 3 x 2 x 2 = 24 different groups selected (on average about 4 participants per group)
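The group count is simply the size of the Cartesian product of the four dimensions, as this short check shows. The level labels are taken from the other slides of the trial; the variable names are illustrative.

```python
from itertools import product

genders = ["female", "male"]
age_groups = ["< 25", "26-60", "> 60"]
input_modes = ["speech only", "DTMF enabled"]
first_calls = ["fictitious 1st", "own 1st"]

groups = list(product(genders, age_groups, input_modes, first_calls))
participants = 100

print(len(groups))                 # number of distinct groups
print(participants / len(groups))  # average participants per group
```

This gives 24 groups and an average of roughly 4.2 participants per group.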
Selection of valid data

INPUT MODE     ORDER            GENDER   Age group 1   Age group 2   Age group 3
Speech only    Fictitious 1st   Female   3             7             3
Speech only    Fictitious 1st   Male     4             6             3
Speech only    Own 1st          Female   4             6             2
Speech only    Own 1st          Male     4             7             1*
DTMF enabled   Fictitious 1st   Female   3             7             1*
DTMF enabled   Fictitious 1st   Male     4             6             2
DTMF enabled   Own 1st          Female   4             7             2
DTMF enabled   Own 1st          Male     5             6             3

(*) participants removed from the analysis
Trial data analysis
• Callers made several mistakes, especially elderly callers and those using fictitious data. Examples:
  • DTMF instead of voice, or vice versa, for the ID
  • fiscal code instead of ID code
  • extraneous words, e.g. “My ID code is …”
  • bad spelling: ‘J’ / ‘Y’ and ‘O’ / ‘0’ exchanged
  • day of the month uttered as a whole date
  • speaking rate too low during the fiscal code (FC) stage
  • fiscal code spelled using city names
• Safe processing
  • No wrong data was accepted by the system.
Attitude by Gender (own details)
[Chart: mean Likert rating per questionnaire item, scale 1 to 7, labelled negative / neutral / positive]
Female: N=48, Mean=5.11. Male: N=50, Mean=5.28.
Items: Polite, Reliable, Friendly, Efficient, Flustered, Use again, Confusing, Needs improvement, Frustrating, Liked voice, Ease of use, Voice clarity, Complicated, Under stress, Prefer human, Concentration, Enjoyed using, Speed of service, Knew what to do, Degree of control
Attitude by Age Group (own details)
[Chart: mean Likert rating per questionnaire item, scale 1 to 7, labelled negative / neutral / positive]
< 25: N=31, Mean=5.53. 26-60: N=52, Mean=5.09. > 60: N=15, Mean=4.90.
Items: Polite, Reliable, Friendly, Efficient, Flustered, Use again, Confusing, Needs improvement, Frustrating, Liked voice, Ease of use, Voice clarity, Complicated, Under stress, Prefer human, Concentration, Enjoyed using, Speed of service, Knew what to do, Degree of control
Fiscal Code Recognition (Constrained)
[Bar chart: % of utterances, inside vocabulary]
• Correct recognition: 90.3%
• Substitutions: 9.7%
• False rejects: 0.0%
Recognition Performance (Date)
[Bar chart: CORR % vs. ERR %, axis 0% to 100%]
• Full date: 86.3 % correct, 13.8 % errors
• Distrib. Errors (day, month, year): 72.7 %
• Day only: 94.4 % correct, 5.6 % errors
• Month only: 100 % correct
• Year only: 84.2 % correct, 15.8 % errors
Recognition Performance (City/Province of birth)
[Bar chart: CORR % vs. ERR %, axis 0% to 100%]
• City: 93.7 % correct, 6.3 % errors
• Province: 90.9 % correct, 9.1 % errors
Recognition Performance (other tasks)
• Confirmations were recognised correctly in practically 100 % of cases (the main problem was a 10.4 % rate of TOO EARLY warnings from the EPD module, resolved by repeating the question)
• Business VAT code recognition was about 100 % correct (for both the numerical part and the final letter, when present).
Summary
The TARSU-PF service shows that complex fiscal data can indeed be collected by voice from citizens. Nevertheless:
• Improvements are necessary to give more context-dependent feedback
• The date grammar must be updated, reducing the range of years, to take advantage of the service characteristics
• The fiscal code speaking rate of real customers must be measured and used to set up the EPD properly
This was a presentation francesco.senia@loquendo.com