Sinusoidal model

In statistics, signal processing, and time series analysis, a sinusoidal model is used to approximate a sequence Y_i by a sine function:

Y_{i}=C+\alpha \sin(\omega T_{i}+\phi )+E_{i}

where C is a constant defining the mean level, α is the amplitude of the sine, ω is the angular frequency, T_i is the time variable, φ is the phase shift, and E_i is the error sequence.

This sinusoidal model can be fit using nonlinear least squares; to obtain a good fit, routines may require good starting values for the unknown parameters. Fitting a model with a single sinusoid is a special case of spectral density estimation and least-squares spectral analysis.
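As a minimal sketch of such a fit, the following uses `scipy.optimize.curve_fit` on synthetic data. The data, the true parameter values, and the starting values are all assumptions made for illustration; the function name `sinusoid` is likewise hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def sinusoid(t, c, alpha, omega, phi):
    """Single-sinusoid model: C + alpha * sin(omega * t + phi)."""
    return c + alpha * np.sin(omega * t + phi)


# Synthetic data (illustrative values, not taken from the article).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
true_params = (2.0, 1.5, 3.0, 0.5)  # C, alpha, omega, phi
y = sinusoid(t, *true_params) + rng.normal(scale=0.1, size=t.size)

# Starting values reasonably close to the truth. A poor guess for omega
# in particular can leave the optimizer stuck in a local minimum.
p0 = (y.mean(), 1.0, 2.95, 0.0)
popt, pcov = curve_fit(sinusoid, t, y, p0=p0)
c_hat, alpha_hat, omega_hat, phi_hat = popt
```

The fitted amplitude and phase are only determined up to the equivalence between (α, φ) and (−α, φ + π), so a fit may legitimately return a negative amplitude with a shifted phase.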

Good starting values

Good starting value for the mean

A good starting value for C can be obtained by calculating the mean of the data. If the data show a trend, i.e., the assumption of constant location is violated, one can replace C with a linear or quadratic least squares fit. That is, the model becomes

Y_{i}=(B_{0}+B_{1}T_{i})+\alpha \sin(\omega T_{i}+\phi )+E_{i}

or

Y_{i}=(B_{0}+B_{1}T_{i}+B_{2}T_{i}^{2})+\alpha \sin(\omega T_{i}+\phi )+E_{i}
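A short sketch of both cases, assuming NumPy and synthetic data with a linear trend (all numerical values are illustrative): the sample mean gives a starting value for C, and `np.polyfit` gives least-squares starting values for B_0 and B_1.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 20.0, 400)
# Illustrative data: linear trend plus sinusoid plus noise.
y = (1.0 + 0.3 * t) + 2.0 * np.sin(1.7 * t + 0.4) \
    + rng.normal(scale=0.2, size=t.size)

# Constant location: the sample mean is the starting value for C.
c0 = y.mean()

# Trend present: fit B0 + B1*T by ordinary least squares.
# np.polyfit returns coefficients with the highest power first.
b1, b0 = np.polyfit(t, y, deg=1)
detrended = y - (b0 + b1 * t)
```

Passing `deg=2` instead would give starting values for the quadratic form. The sinusoid itself biases these estimates slightly when the record does not cover a whole number of cycles, which is acceptable for starting values.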

Good starting value for frequency

A starting value for the frequency can be taken from the location of the dominant peak in a periodogram (converted to angular frequency if the periodogram is expressed in cycles per unit time). A complex demodulation phase plot can be used to refine this initial estimate of the frequency.
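The periodogram step might look like the following sketch, which assumes uniformly sampled data and uses NumPy's real FFT to form a raw periodogram. The sampling interval and signal parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
dt = 0.05                        # sampling interval (assumed uniform)
t = dt * np.arange(800)          # record length of 40 time units
y = 1.0 + 2.0 * np.sin(3.0 * t + 0.7) + rng.normal(scale=0.3, size=t.size)

# Remove the mean so the zero-frequency bin does not dominate.
y0 = y - y.mean()

# Raw periodogram: squared DFT magnitude at the Fourier frequencies.
spectrum = np.abs(np.fft.rfft(y0)) ** 2 / y0.size
freqs = np.fft.rfftfreq(y0.size, d=dt)    # cycles per unit time

k = np.argmax(spectrum[1:]) + 1           # skip the zero-frequency bin
omega0 = 2.0 * np.pi * freqs[k]           # angular-frequency starting value
```

The estimate is limited to the spacing of the Fourier frequencies, here 2π/40 in angular units, so it serves as a starting value for the nonlinear fit rather than a final estimate.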

Good starting values for amplitude

An estimate of the sinusoid amplitude can be obtained by multiplying the root mean square of the detrended data by the square root of two. A complex demodulation amplitude plot can also be used to find a good starting value for the amplitude. In addition, this plot can indicate whether the amplitude is constant over the entire range of the data or varies. If the plot is essentially flat, i.e., has zero slope, it is reasonable to assume a constant amplitude in the nonlinear model. However, if the slope varies over the range of the plot, one may need to adjust the model to be:

Y_{i}=C+(B_{0}+B_{1}T_{i})\sin(\omega T_{i}+\phi )+E_{i}

That is, one may replace α with a function of time. A linear fit is specified in the model above, but this can be replaced with a more elaborate function if needed.
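For the constant-amplitude case, the root-mean-square rule can be sketched as follows. It relies on the fact that a pure sinusoid of amplitude α has root mean square α/√2 over whole cycles. The data and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 50.0, 1000)
y = 4.0 + 2.5 * np.sin(1.3 * t + 0.2) + rng.normal(scale=0.2, size=t.size)

# Constant-location case: detrending is just subtracting the mean.
detrended = y - y.mean()

rms = np.sqrt(np.mean(detrended ** 2))
alpha0 = np.sqrt(2.0) * rms      # amplitude starting value
```

Noise adds to the root mean square, so this starting value tends to overestimate the amplitude slightly; the nonlinear fit then corrects it.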

Model validation

As with any statistical model, the fit should be subjected to graphical and quantitative techniques of model validation. For example, a run sequence plot of the residuals can be used to check for significant shifts in location or scale, start-up effects, and outliers. A lag plot can be used to verify that the residuals are independent; outliers also appear in the lag plot. A histogram and a normal probability plot can be used to check for skewness or other non-normality in the residuals.
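Simple numerical counterparts to these plots can be computed directly from the residuals. The sketch below is one possible choice of summaries, not a prescribed procedure: the lag-1 correlation stands in for the lag plot, the sample skewness for the histogram and normal probability plot, and a three-standard-deviation screen for outlier inspection. The residuals here are simulated stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in residuals; in practice use the data minus the fitted values.
resid = rng.normal(scale=0.3, size=500)

# Lag plot counterpart: correlation between resid[i] and resid[i + 1].
lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]

# Histogram / normal probability plot counterpart: sample skewness.
z = (resid - resid.mean()) / resid.std()
skewness = np.mean(z ** 3)

# Rough outlier screen: indices more than 3 standard deviations out.
outliers = np.flatnonzero(np.abs(z) > 3.0)
```

A lag-1 correlation far from zero suggests the residuals are not independent, for example because the fitted frequency is slightly off and a periodic component remains.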

Extensions

A different method transforms the nonlinear regression into a linear regression by means of a suitable integral equation. No initial guess and no iterative process are then needed: the fit is obtained directly.^[1]