Angle Resolved x-ray Photoelectron Spectroscopy, ARXPS – Experience in the Wafer Processing Industry so far. C.R. Brundle, C.R. Brundle & Associates, Soquel, CA; G. Conti, Y. Uritsky, DTCL, Applied Materials, Santa Clara, CA; J. Wolstenholme, Thermo Inc.
Angle Resolved x-ray Photoelectron Spectroscopy, ARXPS – Experience in the Wafer Processing Industry so far • C.R. Brundle, C.R. Brundle & Associates, Soquel, CA • G. Conti, Y. Uritsky – DTCL, Applied Materials, Santa Clara, CA • J. Wolstenholme, Thermo Inc. • Our practical experience using ARXPS for determining the following: • 1. Thickness for nominally single overlayer films (0-40Å) – Characterization and Metrology • 2. Composition depth distribution (0-40Å) – Characterization • Note: “Dose” is a sub-set of composition (Metrology?) • Taken as a given that XPS is a powerful technique for elemental and chemical state identification for 0-40Å films. Acknowledgements: Charles Wang, Ghazal Peydaye-Saheli at Applied Materials
ARXPS – Experience in the Wafer Processing Industry • Will use our experience over a 3-4 year period with 10-30Å Si/O/N gate oxide material, as produced in development by Applied Materials wafer processing tools and processes for Semiconductor Industry customers. • Will refer to a few other necessary “illustrative examples” along the way.
What does this industry want? • THICKNESS • High precision (better than 1% at 1σ repeatability / reproducibility for a 10Å film). • For metrology: fast (seconds per point), 5/9 point maps on 300mm wafers. • Accuracy is of less concern; for metrology, of no concern. Thickness will be calibrated anyway, using a λeff (“effective attenuation length”). • Would like to be able to distinguish “apparent thickness variations” from what are really materials changes. • → λeff changing with material change • → λeff changing with t
What does this industry want? • DOSE (e.g. N in Si/O/N; As in Si(100)) • 1% precision at 10Å for 1×10¹⁵ atoms/cm². • Accuracy is again of less concern, BUT need to distinguish “apparent dose changes” from depth distribution changes. • DEPTH DISTRIBUTION • A crude distribution is OK (layer model approach?). • BUT it needs to be reproducible and correct. • Would like to be able to detect small variations in a given distribution (e.g. wafer to wafer or point to point on a wafer).
Other Issues • For the Si/O/N work described here we assume flat (low roughness), laterally homogeneous (over analysis area) films. We know this to be true. • For Hf based high k work the above is not always true. • All the work is done using the Theta 300 (Thermo Inc) tool. • All the “recipe development” for converting ARXPS data to depth profiling is done by P. Mack at Thermo Inc. We are merely users, though we do have the freedom to vary some parameters. • We often have to make correlations with data from the ReVera tool (Gate group at Applied Materials), which is a single angle only tool designed specifically for metrology (t, N dose) in Si/O/N
Theta Probe – Parallel ARXPS (PARXPS) Theta Probe avoids the disadvantages by collecting all angles in parallel.
The Theta Probe ARXPS Solution Two Dimensional Detector Measures Energy and Angle Simultaneously
Collection Conditions • Angular Range • 20° to 80° • Parallel collection • Up to 96 channels in angle • Generally, 16 angles are used giving an angular resolution of 3.75° • Up to 112 channels in energy • Parallel collection allows rapid ‘snapshot acquisition’ • Excellent for ARXPS maps • Thickness maps • Dose maps
How is Surface Sensitivity Achieved? • Intensity as a function of depth • 65% of signal from <λ • 85% from <2λ • 95% from <3λ • Information depth greater than thickness of gate dielectric • λ = Inelastic Mean Free Path (0.4 - 4nm)
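The percentages on this slide follow from exponential attenuation of the signal with depth. A minimal sketch, assuming normal emission and a simple exp(-z/λ) attenuation (the function name is illustrative, not from the slides); the exact values come out close to the rounded figures quoted above:

```python
import math

def signal_fraction(depth_in_lambda: float) -> float:
    """Fraction of the total signal originating from shallower than the
    given depth, expressed in units of lambda, at normal emission.
    Integrating exp(-z/lambda) from 0 to the depth gives 1 - exp(-depth)."""
    return 1.0 - math.exp(-depth_in_lambda)

for n in (1, 2, 3):
    print(f"< {n} lambda: {signal_fraction(n) * 100:.1f}%")
# prints 63.2%, 86.5% and 95.0%
```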
Typical XPS Full Spectrum For Si/N/O
[Figure: survey spectrum, Counts / s (0-160000) vs. Binding Energy (1100-0 eV); peaks labelled O1s, Si2p, C1s, N1s]
ARXPS data for each element present
[Figure: four panels of angle-resolved spectra, Counts / s vs. Binding Energy (eV): Si2p (108-96 eV, Si4+ component labelled), N1s (406-390 eV), C1s (295-275 eV), O1s (540-526 eV)]
Information depth varies with collection angle • $I = I^{\infty} \exp\left(-d / (\lambda \cos\theta)\right)$ • Spectra from thin films on substrates are affected by the collection angle
What is Angle Resolved XPS (ARXPS)? • XPS as a function of the angle, θ (w.r.t. the surface normal), at which the photoelectrons leave the surface • A set of measurements over a range of θ provides composition information over a range of depths.
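To make the angle dependence concrete, the sketch below evaluates the attenuation law from this slide for a substrate signal under an overlayer, together with the depth containing about 95% of the signal (3λ cos θ). The function names and the numerical values of d and λ are illustrative assumptions, not values from the presentation.

```python
import math

def substrate_attenuation(d_nm: float, lam_nm: float, theta_deg: float) -> float:
    """I / I_inf for a substrate signal passing through an overlayer of
    thickness d, at emission angle theta measured from the surface normal."""
    return math.exp(-d_nm / (lam_nm * math.cos(math.radians(theta_deg))))

def information_depth(lam_nm: float, theta_deg: float, n_lambda: float = 3.0) -> float:
    """Depth from which roughly 95% of the signal originates (3 lambda cos theta)."""
    return n_lambda * lam_nm * math.cos(math.radians(theta_deg))

# Hypothetical example: 2 nm overlayer, lambda = 3 nm
for theta in (20, 50, 80):
    print(theta,
          round(substrate_attenuation(2.0, 3.0, theta), 3),
          round(information_depth(3.0, theta), 2))
```

The substrate signal drops and the information depth shrinks as θ moves toward grazing emission, which is what makes the high-angle data surface sensitive.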
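Thickness Determination
• Based on the classical approach of determining the ratio of overlayer / substrate XPS intensities and using the Beer-Lambert equation and values for λ.
• For Si/O/N on Si(100) the overlayer signal is Si4+ and the substrate is Si0.
$\dfrac{I_{\mathrm{Si^{4+}}}}{I_{\mathrm{Si^{0}}}} = \dfrac{I^{\infty}_{\mathrm{Si^{4+}}} \left[1 - \exp\left(-d / (\lambda_{\mathrm{Si^{4+},SiO_2}} \cos\theta)\right)\right]}{I^{\infty}_{\mathrm{Si^{0}}} \exp\left(-d / (\lambda_{\mathrm{Si^{0},SiO_2}} \cos\theta)\right)}$ (1)
$R = R^{\infty}\, \dfrac{1 - \exp\left(-d / (\lambda_{\mathrm{Si^{4+},SiO_2}} \cos\theta)\right)}{\exp\left(-d / (\lambda_{\mathrm{Si^{0},SiO_2}} \cos\theta)\right)}$ (2)
$\lambda_{\mathrm{Si^{4+},SiO_2}} \approx \lambda_{\mathrm{Si^{0},SiO_2}}$ (KEs are nearly the same), so this reduces to:
$\ln\left[1 + R / R^{\infty}\right] = d / (\lambda_{\mathrm{Si,SiO_2}} \cos\theta)$ (3)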
Thickness Determination
• Many sources for λSi,Si (and λ values in general; see C. Powell publications)
• The classical approach ignores elastic scattering, λe. We know (Powell et al) this can cause significant errors, so a λeff should be used; the errors vary with thickness, so λeff becomes a function of t.
• The effects of elastic scattering get greater at higher θ (more grazing angle), over-representing the substrate and leading to a low estimate of t if a fit is made to equation 3 that includes data at high θ (see later).
• Our values of λ come from the Thermo Inc algorithm. They are calculated on the basis of formula, density, band gap, and KE.
$R^{\infty} = \dfrac{I^{\infty}_{\mathrm{SiO_2}}}{I^{\infty}_{\mathrm{Si}}} = \dfrac{\sigma_{\mathrm{Si,SiO_2}}\, \lambda_{\mathrm{Si,SiO_2}}}{\sigma_{\mathrm{Si,Si}}\, \lambda_{\mathrm{Si,Si}}} = \dfrac{D_{\mathrm{SiO_2}}}{D_{\mathrm{Si}}} \times \dfrac{F_{\mathrm{Si}}\, \lambda_{\mathrm{Si,SiO_2}}}{F_{\mathrm{SiO_2}}\, \lambda_{\mathrm{Si,Si}}}$
Thickness Measurement: Testing Model Validity
• Silicon Dioxide on Silicon
• Plot: ln[1+R/R∞] vs. 1/cos(θ)
• Fitting: Fit through the origin
• Gradient: = d/λ
• NOMINAL THICKNESS VALUES FROM ELLIPSOMETRY
[Figure: fitted lines for films of nominal thickness 9.0, 6.4, 4.3, 3.6, 2.3 and 1.9 nm]
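The fitting procedure on this slide (plot ln[1+R/R∞] against 1/cos θ, fit through the origin, read d from the gradient) can be sketched as below. This is a minimal illustration on synthetic data, not the Thermo algorithm; the function name and the values of λ, R∞ and d are hypothetical.

```python
import numpy as np

def thickness_from_arxps(theta_deg, ratio, r_inf, lam):
    """Least-squares fit of ln(1 + R/R_inf) = (d/lam) * (1/cos theta),
    constrained through the origin. Returns d in the units of lam."""
    x = 1.0 / np.cos(np.radians(np.asarray(theta_deg, dtype=float)))
    y = np.log1p(np.asarray(ratio, dtype=float) / r_inf)
    gradient = np.sum(x * y) / np.sum(x * x)  # slope of a line through (0, 0)
    return gradient * lam

# Synthetic, noise-free data generated from the single-overlayer model
theta = np.linspace(23, 60, 10)          # emission angles in degrees
lam, d_true, r_inf = 3.4, 2.0, 0.9       # hypothetical values (nm, nm, unitless)
ratio = r_inf * (np.exp(d_true / (lam * np.cos(np.radians(theta)))) - 1.0)

d_fit = thickness_from_arxps(theta, ratio, r_inf, lam)
print(f"fitted d = {d_fit:.3f} nm")
```

With real data, curvature in the plot or a fit that misses the origin is the signal that the single-overlayer model, or the angular range used, is not valid for that film.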
Comparison of XPS Results To Ellipsometry • SiO2 on Si • Excellent linearity • Ellipsometry included C layer in thickness • The offset will change as a function of time as more contamination is picked up
Thickness • Data considered for Si/O/N on Si(100) • 1) 8 sample set with t ~ 10-30Å • N%age ~ 7-30% • - 4 from process A; 4 from process B • - Determine d, N dose, and Max. Ent. derived depth profile • - Only one set of experimental data, but evolving treatment over a 3 year period. • Note: very large t and N% range, not typical for metrology
Thickness: • Quality of Data? • Manual Fits - Operator Influence? • - Repeatability by single operator? • Effect of changing composition (N%age), which is large here? • Effects of angular range used? • - Depends on thickness, material • - Consequence for single angle determination? • Effect of composition variation with depth? • → Automated 3-layer model (P. Mack, Thermo) • - No operator dependency • - Completely reproducible • - Iterative fit to 3-layer depth distribution model and t (i.e. value of N dose and its distribution effect, t)
Single Overlayer Model for Film Thickness: quality of data?; manual fits? A-11 There is ambiguity in assigning intensity between the Si4+ and Si0 peaks
XPS Measurements of SiO2 Thickness: Effect of angular range included? • Comparison of ARXPS with fixed angle XPS • Good agreement except at large thickness • Single angle measurement samples large angular range. • ARXPS measurements • Effect of angular range upon measured thickness • Minimum angle is 23° in all cases • Highest usable maximum angle depends upon oxide thickness J. Wolstenholme, Thermo, Inc.
Thickness • 8 SAMPLE SET of Si/O/N – One set of experimental data, but how it has been processed has changed from 2003 to 2007. Note: Very large t and N% range
Thickness Conclusions • Precision of data is no problem • Validity of model should be tested (i.e. use angular data and fit to the equation, not just a single angle determination) • For Si4+ (overlayer) / Si0 (substrate) fit to data, operator dependence for manual fit can be a problem • Automated fit (3 layer model) can be completely reproducible • Relative accuracy depends on validity of parameters input – λ(f(t)?), density (f(N%age)), depth distribution (f(N%age)?) • (e.g. 14.1Å for an 8.5%N film going to 20.1Å for a 23.7%N film, found using the manual non-iterative model, is a very different %age change compared to 13.8Å going to 16.8Å in the 3 layer model)
N Dose • So far have been only listing “N%age”; i.e. the usual XPS approach of peak intensities corrected for photoionization cross-section. This assumes homogeneous composition. • Dose is the total amount of N in the film. • If uniform distribution, Dose = N%·t·C • If non-uniform, N%·t·C becomes an “Apparent Dose” • - The “Apparent Dose” can be greater or less than true dose, depending on depth distribution • - ReVera single angle approach? • ∙ Initially – assumed a depth distribution??? • ∙ Now – determines a depth distribution from a Tougaard background approach. • Theta 300/Thermo: N dose by integrating N depth profile distribution from (a) Full Max Ent approach or • (b) 3-layer model (automated).
Effect of Distribution on Dose Calculation
• So, we need to know N distribution to get true N dose
[Figure: two schematic profiles of CN (N concentration) vs. d (depth), one giving True dose < Apparent Dose and the other True dose > Apparent Dose]
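The difference between true and apparent dose can be illustrated with a small numerical sketch. It assumes a simplified single-angle picture: the measured N% is the depth average of the N fraction weighted by exp(-z/(λ cos θ)), and C is taken to be a total atomic density. The slides do not define C, so the 6.6e22 atoms/cm³ used here, along with the profiles, t and λ, are hypothetical.

```python
import numpy as np

def doses(profile, t_cm, lam_cm, theta_deg=0.0, atom_density=6.6e22, n=2000):
    """Return (true_dose, apparent_dose) in atoms/cm^2.
    profile(u): N atomic fraction at fractional depth u = z/t (0 = surface).
    atom_density: assumed total atomic density C in atoms/cm^3 (hypothetical)."""
    dz = t_cm / n
    z = (np.arange(n) + 0.5) * dz                  # midpoint depth grid
    c = profile(z / t_cm)
    w = np.exp(-z / (lam_cm * np.cos(np.radians(theta_deg))))  # XPS depth weighting
    true_dose = atom_density * np.sum(c) * dz      # integral of the N profile
    apparent_fraction = np.sum(c * w) / np.sum(w)  # N% seen if assumed homogeneous
    apparent_dose = apparent_fraction * t_cm * atom_density    # N% * t * C
    return true_dose, apparent_dose

t, lam = 15e-8, 30e-8   # 15 Å film, 30 Å attenuation length, both in cm
profiles = {
    "uniform": lambda u: np.full_like(u, 0.10),
    "surface-weighted": lambda u: 0.20 * (1.0 - u),
    "interface-weighted": lambda u: 0.20 * u,
}
results = {name: doses(p, t, lam) for name, p in profiles.items()}
for name, (true_dose, apparent_dose) in results.items():
    print(f"{name:20s} true = {true_dose:.2e}  apparent = {apparent_dose:.2e}")
```

All three profiles hold the same true dose; in this simplified model the surface-weighted one reads high and the interface-weighted one reads low, which is the ambiguity the slide describes.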
N Dose (x e15 atoms/cm2): 8 sample set (from integrating N depth distribution; discussed later)
N Dose Conclusions • NEED CALIBRATION/VERIFICATION BY MEIS! • Striking agreement between 3-layer model and the June 2003 Max Ent results, except for very high N content (even though large differences in estimated t!). • June 2003 – About 8% spread from pure N%·t approach. • 3-layer – About 15% spread from pure N%·t approach, but linear • Limiting angular range (66°-55°) produces up to 10% variation (because Max Ent derived depth profile is different). • Note: very large dose variations are being considered here. Not usual for metrology.
Ultra-Thin Film Depth Profiling by ARXPS Status • Because of the short mean free path lengths, λ, of the photoelectrons generated and used in XPS, non-destructive depth profiling is limited in the depth it can effectively reach • 65% from < 1 λ; 85% from < 2 λ; 95% from < 3 λ • λ ranges from 0.5nm to 4nm (material and electron energy dependent) • How limited depends on level of detail wanted • ARXPS quite capable of detecting a substrate > 3 λ down, but not profiling the 3 λ overlayer or giving a precise thickness • Detailed profiling possible up to ~ 2 λ thickness • Reliability of profile obtained by ARXPS? • Relative Depth Plot, RDP - QUALITATIVE but simple, fast, model independent • Maximum Entropy Method - QUANTITATIVE, but modelled and requires experience or a ‘recipe’
Processing the data – RDP
A relative depth index can be calculated using:
$\text{RDP ratio} = \ln\left\{ \dfrac{\text{Peak Area (Surface)}}{\text{Peak Area (Bulk)}} \right\}$
An indication of the layer order can then be achieved by plotting out the relative depth index for each species.
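As a small illustration of the relative depth index, the sketch below computes the log ratio for a few species and sorts them from surface to bulk. It assumes "Surface" and "Bulk" mean the peak areas at a surface-sensitive (grazing) and a bulk-sensitive (near-normal) emission angle; the peak areas are made-up numbers, not data from the presentation.

```python
import math

def relative_depth_index(area_surface: float, area_bulk: float) -> float:
    """ln(peak area at surface-sensitive angle / peak area at bulk-sensitive angle)."""
    return math.log(area_surface / area_bulk)

# Hypothetical peak areas: (surface-sensitive angle, bulk-sensitive angle)
peak_areas = {
    "Si 2p (El)": (300.0, 1500.0),
    "C 1s": (1200.0, 400.0),
    "Si 2p (Ox)": (900.0, 1000.0),
}

rdp = {name: relative_depth_index(s, b) for name, (s, b) in peak_areas.items()}

# Larger index = relatively more signal at grazing emission = nearer the surface
layer_order = sorted(rdp, key=rdp.get, reverse=True)
print(layer_order)  # ['C 1s', 'Si 2p (Ox)', 'Si 2p (El)']
```

The ordering carries no depth scale or concentrations; it only ranks species by how strongly their signal favours the surface-sensitive angle.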
Construction:
• Collect ARXPS spectra
• For each element, calculate: RDP
Information
• Reveals the ordering of the chemical species
[Figure: sample stack C / HfO2/Al2O3 / SiO2 / Si beside an RDP axis running from Surface to Bulk; species plotted: C1s, Al 2p, O1s (Low BE), Hf 4f, O 1s (High BE), Si 2p (Ox), Si 2p (El)]
ALD TaN Film: chemical state RDP Angle Resolved Spectra from TaN Sample TaOx TaNt
Relative Depth Profile, RDP • Advantages • Fast • Model independent, no assumptions • Limitation • No depth scale • No concentration profile structures • In my opinion an RDP is the most generally useful approach in ARXPS for characterization of unknown film structures seen during process development.
Max. Ent.: Depth Profile Generation
• Generate Random Profile
• Calculate Expected ARXPS Data (Beer Lambert Law): $T_j(\theta) = \exp\left(-t / (\lambda \cos\theta)\right)$
[Figure: sample stack C / Al2O3 / SiO2 / Si; calculated Atomic percent (%) vs. Angle (80°-30°) for O, Si4+, C, Si0 and Al, running from surface sensitive (high angle) to more bulk sensitive (low angle)]
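The "calculate expected ARXPS data" step can be sketched as a forward model: each thin layer contributes its concentration attenuated by the Beer-Lambert transmission through everything above it. The code below is an illustrative sketch only. It assumes one λ for all photoelectron lines, ignores sensitivity factors and elastic scattering, and uses a hypothetical C / Si4+ / Si0 stack; none of these choices are taken from the Thermo recipe.

```python
import numpy as np

def expected_arxps(conc, layer_t, lam, theta_deg):
    """Apparent atomic fractions vs. angle for a layered profile.
    conc: array (n_layers, n_species), concentration of each species per layer
    layer_t: thickness of each layer; lam: attenuation length (same units)
    Returns an array of shape (n_angles, n_species)."""
    conc = np.asarray(conc, dtype=float)
    cos_t = np.cos(np.radians(np.asarray(theta_deg, dtype=float)))
    depth = np.arange(conc.shape[0]) * layer_t                 # depth of top of layer j
    trans = np.exp(-depth[None, :] / (lam * cos_t[:, None]))   # T_j(theta)
    intensity = trans @ conc                                   # sum over layers
    return intensity / intensity.sum(axis=1, keepdims=True)

# Hypothetical stack in 0.2 nm layers; columns: C, Si4+, Si0
layers = np.zeros((60, 3))
layers[0:2, 0] = 1.0     # 0.4 nm carbon contamination
layers[2:12, 1] = 1.0    # 2.0 nm oxide
layers[12:, 2] = 1.0     # substrate, truncated at 12 nm depth
angles = [30, 40, 50, 60, 70, 80]
fractions = expected_arxps(layers, layer_t=0.2, lam=3.0, theta_deg=angles)
print(np.round(fractions, 3))
```

In the output the surface species grows and the substrate species shrinks toward 80°, the same surface-sensitive to bulk-sensitive trend shown in the slide's figure.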
Depth Profile Generation (cont.) • The MaxEnt solution is derived by minimising χ² while maximising the entropy • Determine error between observed and calculated data (χ²) • Calculate the entropy associated with a particular profile (the probability of finding the sample in that particular state); cj,i is the concentration of element i in layer j • Maximise the joint probability function • Repeat process to obtain most likely profile
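The equations for this slide did not survive extraction, so the sketch below uses commonly quoted maximum-entropy forms as an assumption: a χ² misfit, an entropy measured relative to a default profile, and a joint function Q = αS − χ²/2 to be maximised. The function names, the default profile and the specific functional forms are illustrative and may differ from the Thermo implementation.

```python
import numpy as np

def chi_squared(calc, obs, sigma):
    """Misfit between calculated and observed ARXPS data."""
    calc, obs = np.asarray(calc, float), np.asarray(obs, float)
    return float(np.sum(((calc - obs) / sigma) ** 2))

def entropy(conc, default):
    """Entropy of a profile c[j, i] relative to a default profile m[j, i]
    (assumed form: sum of c - m - c*ln(c/m)). Requires c > 0 and m > 0.
    It is zero when c equals m and negative otherwise."""
    c, m = np.asarray(conc, float), np.asarray(default, float)
    return float(np.sum(c - m - c * np.log(c / m)))

def joint_objective(calc, obs, sigma, conc, default, alpha):
    """Q = alpha * S - chi^2 / 2. Larger alpha favours smoother profiles;
    smaller alpha favours fitting the data more closely."""
    return alpha * entropy(conc, default) - 0.5 * chi_squared(calc, obs, sigma)

# Tiny illustration: 2 layers x 2 elements with a flat default profile
default = np.full((2, 2), 0.5)
trial = np.array([[0.5, 0.5], [0.2, 0.8]])
obs = np.array([0.40, 0.60])
calc = np.array([0.42, 0.58])
q = joint_objective(calc, obs, sigma=0.02, conc=trial, default=default, alpha=2e-4)
print(q)
```

In a full reconstruction, a search over trial profiles would recompute the calculated data from each profile and keep the one with the largest Q; the choice of α is the balance discussed under reliability.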
Reliability of Max Ent Modeling • Simple model fit to the data can never be unique! The Max Ent approach (balance with Entropy) is a “regularization” approach. Details of results are nearly always over-interpreted. • Balance of χ² and α is operator (or recipe) chosen • Requires experience with sample at that thickness • Requires assumptions about ‘unrealistic solutions’ • e.g. Too spiky a distribution? χ² weighting too high (or α too small) • e.g. Too smooth, substrate never reaches 100%, film elements never go to 0%? α too big • For a ‘simple’ film of < 2λ; with good statistics data; a substrate with no species common to the film; zero or small surface contamination • Develop reliable recipe (χ², α, …verification?) • Possible to obtain a reliable profile for system appropriate to that ‘recipe’ (see examples following) • Is it for Si/O/N with t, N dose variations?
HfSiON Reconstructed Profile
[Figure: reconstructed depth profile, At % (0-100) vs. Depth / nm (0-5); traces: Si0, Si4+, O, N, Hf; regions labelled HfSiON and Si]
Comparison of ARXPS with MEIS
[Figure: depth profiles; traces: N, Hf, Si0, Si4+, O, Total Si (MEIS)]
PEO-thiol SAM on Silver
SAM = -S-(CH2)11-(O-CH2-CH2-)3-OH
[Figure: sample stack SAM / Ag / TiW / Quartz; Depth Profile and Relative Depth Plot for Ag 3d, C 1s (H/C), C 1s (Ether), O 1s, S 2p]
Example of Max Ent Derived Depth Profile on an Ultra-Thin Si/O/N Film • Reliability? • Need high quality angular data – good S/N • Need “constraints” and a “recipe” for the α term
Effect of Depth Distribution on Peak Intensity Ratios Extreme Example: answer qualitatively obvious from raw data or RDP, but cannot know whether detailed Max Ent distribution is valid without verification/calibration by some other method.
Repeatability of ARXPS Concentration Profiles • Three ARXPS datasets acquired dynamically from a point on a Si oxynitride sample (sample repositioned each time). • Concentration profiles reconstructed from each dataset • Good reproducibility of reconstructed profiles.
June 2003. Max. Ent. α=2e-4

Process A
Sample        t (Å)   N (%)   N Dose (atoms/cm²)
Process 11A   19.8    6.7     8.57 × 10¹⁴
Process 1A    14.1    8.5     7.61 × 10¹⁴
Process 3A    16.3    16      1.70 × 10¹⁵

Process B
Sample        t (Å)   N (%)   N Dose (atoms/cm²)
Process 15B   10.4    9.6     6.33 × 10¹⁴
Process 3B    11.2    12.1    8.41 × 10¹⁴
Process 13B   14.2    18.6    1.72 × 10¹⁵
Example of Chemical Depth Profiling, June 2003: Distinction of Si-O Using Si Chemical Shifts
• Different Si 2p binding energy for Si4+ in SiO2 and Si/O/N allows separation in profile
[Figure: Si 2p spectrum, Binding Energy 110-94 eV, with Si0, Si/O/N and SiO2 components; SiO4 and SiO3N environments indicated]
[Figure: depth profiles with panel labels t = 21.1 Å, N = 29.8%; SET 6B; SET 10A, t = 20.1Å, N = 23.7%]
• Film is actually more like this: Post Oxidation? [Schematic stack labelled SiO2, Si/O/N and Si (100), with a Graded Region indicated]
Normalized Overlays of N Distribution, June 2003
• Set A and set B are very similar (not expected)
• N distribution does not change much with N total dose
• Hard to get more than 10% N absolute at surface (air oxidation and HC pickup will reduce N content)
• No evidence for a nitrogen spike at the surface, cf. TOF SIMS (this was the original reason for studying these sets of samples)
[Figure: two panels (Set 1, Set 2) of normalized N profiles with the Interface marked; traces labelled Slot 15, Slot 10, Slot 6, Slot 3, Slot 3*, Slot 1, Slot 11, Slot 13]