## Hyun Sik Sim*

Process and processing equipment | |||||||||
---|---|---|---|---|---|---|---|---|---|
p10 | p20 | p30 | p40 | p50 | p60 | p70 | p80 | p90 | p100 |
f11 | f21 | f31 | f41 | f51 | f61 | f71 | f81 | f91 | f101 |
f12 | f22 | f32 | f42 | f52 | f62 | f72 | f82 | f92 | f102 |
f13 | - | f33 | f43 | f53 | f63 | f73 | f83 | f93 | - |
- | - | f34 | f44 | - | - | f74 | - | - | - |

p30 (Cu Plating), p50 (Exposure), p60 (Develop), p70 (Plating), p90 (Strip).

The linear regression model is applied when analyzing the correlations between one dependent variable and several independent variables. In general, the dependent variable in a linear regression model is assumed to take continuous values. However, when the independent variables determine whether an outcome is “good” or “faulty,” that is, when the analysis involves a binomial event in which the dependent variable is either “0” or “1,” the correlations between the dependent and independent variables cannot be explained sufficiently by conventional linear regression analysis [8]. For such cases, the model must be restructured to explain the probability rather than the quantity of the dependent variable. In general, a regression analysis model can be expressed as a linear equation relating the independent variables ($$x_{1}, x_{2}, \cdots, x_{k}$$) and the dependent variable ($$y$$), as follows:

$$y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{k} x_{k} \quad (1)$$

However, to handle a binary dependent variable, such as the occurrence or nonoccurrence of a defect, y must be a probability value (P) between 0 and 1. In the linear regression model in Eq. (1), both the left and right sides have the same range, $$(-\infty, +\infty)$$. If P is substituted for y, the left side is confined to the interval from 0 to 1 while the right side remains unbounded, so the ranges of the two sides no longer coincide. To solve this problem, a linear regression model with a uniform range can be created through a logit transformation using the logistic function in Eq. (2). Besides the logit transformation, other models such as the Gompertz and probit models may be used [9].

The logistic function in Eq. (2) is a linear equation based on the odds concept, which uses a ratio of the probability of occurrence to that of non-occurrence [10], as follows:
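For reference, the standard textbook forms consistent with this description can be sketched as follows. The coefficients $\beta_0, \beta_1, \ldots, \beta_k$ are assumed names for the parameters of the linear predictor, and the pairing of these forms with the equation numbers used in the text is not assumed here.

```latex
% Logistic function: maps the unbounded linear predictor to a probability in (0, 1)
P = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}
         {1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}

% Logit (log-odds) form: the odds P / (1 - P) on a log scale are linear in the x_i
\ln\!\left(\frac{P}{1 - P}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
```

In this form, both sides of the second equation range over $(-\infty, +\infty)$, which is the property the logit transformation is meant to restore.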

The odds are a relative measure of event occurrence, and the odds ratio is the factor by which the odds of the dependent variable change when an independent variable increases by one unit. In conventional linear regression analysis, the coefficient β is estimated by minimizing the sum of squares of the residuals (i.e., the least squares method). In logistic regression, however, the coefficient is estimated using the maximum likelihood method, which chooses the coefficients that maximize the likelihood of the observed events [7].
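The relationship between the odds ratio and a maximum likelihood estimate can be illustrated with a minimal sketch. The toy data below are hypothetical, and plain gradient ascent on the log-likelihood stands in for whatever optimizer a statistical package would actually use; the function names are illustrative only.

```python
import math


def odds(p: float) -> float:
    """Odds = probability of occurrence / probability of non-occurrence."""
    return p / (1.0 - p)


def sigmoid(z: float) -> float:
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_mle(xs, ys, lr=0.5, steps=5000):
    """Estimate (b0, b1) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # gradient contribution of one lot
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical toy data: x = 1 if the lot passed through a given equipment,
# y = 1 if the lot was faulty. 2 of 10 lots fail without it, 5 of 10 with it.
xs = [0] * 10 + [1] * 10
ys = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

b0, b1 = fit_logistic_mle(xs, ys)
odds_ratio = math.exp(b1)  # change in odds when x goes from 0 to 1
```

With a single binary predictor, the fitted odds ratio should agree with the ratio of the observed odds in the two groups, `odds(0.5) / odds(0.2)`, which gives a quick sanity check on the estimate.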

In this study, logistic regression was used to achieve a high yield in the PBGA manufacturing process. Thus, the existence or absence of defects was used as the dependent variable. For the independent variables, the process equipment was selected and converted to discrete variables that took the value “1” if the lot passed through the equipment and “0” otherwise. Next, the correlation between the dependent and independent variables was measured. In the logistic regression model of the binary variable, defect probability (P) values larger than 0.5 were classified as group “1” (Fault) and those smaller than 0.5 as group “0” (Good). Here, the P value sets a classification reference value higher than the mean (0.5), so that the data can be classified sufficiently into group “1”. Furthermore, the stepwise selection method based on the last estimated logistic regression model was used to discover the key equipment factors that affect the defects. The stepwise selection method successively adds factors that have large effects on the dependent variable. Whenever a new factor is added, the existing factors are reviewed step by step to determine whether any should be deleted; likewise, whenever a factor is deleted, the previously deleted factors are reviewed to determine whether their importance has increased enough for them to be added back. In this experiment, the factor that had the largest chi-square value with a significance P-value below 0.05 was selected first.
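The encoding of the independent variables and the classification rule described above can be sketched as follows. The equipment list is a small subset used only for illustration, the lot route is hypothetical, and the function names are not from the study.

```python
def encode_route(route, equipment_ids):
    """Return a 0/1 vector: 1 if the lot passed through the equipment, else 0."""
    passed = set(route)
    return [1 if eq in passed else 0 for eq in equipment_ids]


def classify(p, threshold=0.5):
    """Group 1 (Fault) if the defect probability exceeds the threshold, else group 0 (Good)."""
    return 1 if p > threshold else 0


# Illustrative subset of equipment IDs (Exposure p50 and Develop p60).
equipment_ids = ["f51", "f52", "f53", "f61", "f62", "f63"]
lot_route = ["f52", "f61"]  # hypothetical lot: which equipment it passed through

x = encode_route(lot_route, equipment_ids)
```

The text does not say how a probability of exactly 0.5 is handled; this sketch assigns it to group “0”, which is an assumption.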

In this study, the optimal model equation was derived statistically according to the aforementioned model. The factors that had large chi-square values with P values lower than 0.05 were selected using a significance test for each independent variable. Here, the chi-square value indicates the degree of effect of an independent variable (process or equipment) on the dependent variable (defect rate).

Table 2 lists the processes that affect F2. As shown in Table 2, Exposure (p50), Develop (p60), and Strip (p90) processes were selected as the key processes that have large effects on F2 in descending order of the level of effect. In this study, the cumulative influence of each process was calculated, and three processes that accounted for approximately 70% of the total influence were selected (the P values of p80 and p100 were greater than 0.05).

Table 2.

Process | Number of equipment | $$\chi^2$$ | P-value | Impact (%) | Cumulative impact (%) |
---|---|---|---|---|---|
p50 | 3 | 151.2 | <0.0001 | 29.1 | 29.1 |
p60 | 3 | 103.8 | <0.0001 | 20.0 | 49.1 |
p90 | 3 | 100.3 | <0.0001 | 19.3 | 68.4 |
p40 | 4 | 48.7 | <0.0001 | 9.4 | 77.8 |
p30 | 5 | 37.9 | <0.0001 | 7.3 | 85.1 |
p70 | 5 | 36.9 | <0.0001 | 7.1 | 92.2 |
p10 | 3 | 34.8 | <0.0001 | 6.7 | 98.9 |
p20 | 2 | 4.59 | 0.0320 | 0.9 | 99.8 |
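The cumulative impact column is the running sum of the Impact (%) column, with processes ordered by decreasing chi-square value. A minimal sketch of that calculation follows; the variable names are illustrative, and keeping the top three processes mirrors the choice described in the text rather than a general rule.

```python
from itertools import accumulate

# (process, impact %) pairs from Table 2, ordered by decreasing chi-square value.
impact = [("p50", 29.1), ("p60", 20.0), ("p90", 19.3), ("p40", 9.4),
          ("p30", 7.3), ("p70", 7.1), ("p10", 6.7), ("p20", 0.9)]

# Running sum of the impact column, rounded to one decimal like the table.
cumulative = [round(c, 1) for c in accumulate(v for _, v in impact)]

top_n = 3  # number of key processes kept per fault factor in the study
key_processes = [name for name, _ in impact[:top_n]]
key_share = cumulative[top_n - 1]  # share of total influence covered by them
```

Here `key_share` is the cumulative impact of the three selected processes, which corresponds to the "approximately 70%" mentioned in the text.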

The influence of each process on the other fault factors was calculated using the aforementioned method, and the fault-suspected processes were selected in descending order of their effects. Thus, the processes that had large effects on each fault factor were extracted, as presented in Table 3.

The chi-square values of the factors were calculated, and the processes that had large chi-square values with P values lower than 0.05 were selected (for each fault factor, the three processes accounting for approximately 70% of the total influence). The processes that had large effects on three or more fault factors were Chemical Cu Plating (p30), Develop (p60), Plating (p70), and Strip (p90). In contrast, processes p10, p40, and p80 were not among the key processes for any fault factor, and processes p20, p50, and p100 affected only one or two factors. Therefore, managing the four processes (p30, p60, p70, and p90) that commonly affect the faults is expected to be highly effective. Next, an analysis was performed to identify the faulty equipment, that is, the equipment with the greatest effect within the processes that influence each fault factor.

Table 3.

Process | Factors | |||||
---|---|---|---|---|---|---|
 | F1 | F3 | F4 | F5 | F2 | F6 |
p10 | ||||||
p20 | 3 | |||||
p30 | ⚫1 | ⚫3 | ⚫1 | |||
p40 | ||||||
p50 | 1 | |||||
p60 | ⚫2 | ⚫2 | ⚫2 | ⚫3 | ||
p70 | ⚫2 | ⚫3 | ⚫1 | ⚫2 | ||
p80 | ||||||
p90 | ⚫2 | ⚫3 | ⚫3 | ⚫1 | ||
p100 | 1 | 3 |

⚫ = processes that affected three or more fault factors in common; unmarked entries = processes that only affected specific fault types; 1, 2, 3 = priority of the process by cumulative impact for each fault type.

To identify the fault-suspected equipment, the processes that were performed by two or more pieces of equipment were analyzed. The processes were selected using the stepwise selection method described in Section 4.2. The coefficient of each independent variable (equipment) in the logistic regression model was estimated using the maximum likelihood method. Equipment with a smaller estimated coefficient has a smaller defect probability. In other words, when a process has multiple pieces of equipment, the equipment with a small estimated coefficient is considered normal, whereas the equipment with a large estimated coefficient is considered abnormal. For F2, analyzed in Section 4.2, the results for the fault-suspected equipment are outlined in Table 4 (the reference equipment for each process is excluded from this table).
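The rule above can be sketched for the three key processes of F2 using the estimates reported in Table 4. Two points are assumptions of this sketch rather than statements from the text: the equipment omitted from Table 4 (f53, f63, f93) is treated as the reference level with coefficient 0, and coefficients that are not significant at the 0.05 level are ignored.

```python
ALPHA = 0.05  # significance level used for selection in the text

# (equipment, estimate, p-value) from Table 4; "<0.0001" is entered as 0.0001.
estimates = {
    "p50": [("f51", -0.178, 0.0001), ("f52", -0.090, 0.0027)],
    "p60": [("f61", -0.351, 0.0001), ("f62", 0.029, 0.2153)],
    "p90": [("f91", 0.017, 0.8431), ("f92", -0.162, 0.0006)],
}

# Assumption: the equipment not listed in Table 4 is the reference (coefficient 0).
reference = {"p50": "f53", "p60": "f63", "p90": "f93"}


def normal_and_abnormal(process):
    """Return (normal, abnormal): smallest and largest coefficient in the process."""
    candidates = {reference[process]: 0.0}
    for equipment, estimate, p_value in estimates[process]:
        if p_value < ALPHA:  # assumption: drop non-significant coefficients
            candidates[equipment] = estimate
    normal = min(candidates, key=candidates.get)
    abnormal = max(candidates, key=candidates.get)
    return normal, abnormal


result = {process: normal_and_abnormal(process) for process in estimates}
```

Under these assumptions, the sketch yields f51, f61, and f92 as normal and f53, f63, and f93 as abnormal, which agrees with the F2 rows of Table 5.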

When the estimated coefficient of each independent variable (equipment) in Table 4 is substituted into the regression equation, Eq. (3), the following equation is obtained:

Table 4.

Analysis of maximum likelihood estimate | ||||||
---|---|---|---|---|---|---|
Parameter | Equipment | DF | Estimate | Standard error | $$\chi^{2}$$ | Pr > $$\chi^{2}$$ |
intercept | - | 1 | -4.430 | 0.429 | 106.61 | <0.0001 |
p10 | f11 | 1 | -0.065 | 0.034 | 3.66 | 0.0557 |
 | f12 | 1 | 0.113 | 0.022 | 25.94 | <0.0001 |
p20 | f21 | 1 | 0.029 | 0.013 | 4.59 | 0.0320 |
p30 | f31 | 1 | 0.046 | 0.041 | 1.26 | 0.2603 |
 | f32 | 1 | -0.245 | 0.059 | 16.95 | <0.0001 |
 | f33 | 1 | 0.056 | 0.145 | 0.14 | 0.7009 |
 | f34 | 1 | 0.155 | 0.048 | 10.42 | 0.0012 |
p40 | f41 | 1 | 0.311 | 0.067 | 21.50 | <0.0001 |
 | f42 | 1 | 0.192 | 0.078 | 6.28 | 0.0121 |
 | f43 | 1 | -0.119 | 0.044 | 7.20 | 0.0073 |
p50 | f51 | 1 | -0.178 | 0.019 | 80.67 | <0.0001 |
 | f52 | 1 | -0.090 | 0.030 | 8.98 | 0.0027 |
p60 | f61 | 1 | -0.351 | 0.034 | 100.40 | <0.0001 |
 | f62 | 1 | 0.029 | 0.024 | 1.53 | 0.2153 |
p70 | f71 | 1 | 0.029 | 0.044 | 0.45 | 0.5018 |
 | f72 | 1 | 0.124 | 0.030 | 16.98 | <0.0001 |
 | f73 | 1 | -0.097 | 0.030 | 10.52 | 0.0012 |
 | f74 | 1 | -0.084 | 0.023 | 13.14 | 0.0003 |
p90 | f91 | 1 | 0.017 | 0.086 | 0.03 | 0.8431 |
 | f92 | 1 | -0.162 | 0.047 | 11.84 | 0.0006 |
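Substituting the estimates of Table 4 into a logistic model gives a defect probability for any lot route. The sketch below assumes the standard logit form, in which the log-odds equal the intercept plus the coefficients of the equipment the lot passed through, and it assumes that equipment not listed (the reference equipment) contributes 0. Only a subset of the coefficients is included, and the lot routes are hypothetical.

```python
import math

INTERCEPT = -4.430

# Subset of the estimates in Table 4; equipment not listed here is treated
# as contributing 0 to the log-odds (assumption of this sketch).
COEF = {
    "f11": -0.065, "f12": 0.113, "f21": 0.029, "f41": 0.311,
    "f51": -0.178, "f52": -0.090, "f61": -0.351, "f62": 0.029,
    "f91": 0.017, "f92": -0.162,
}


def defect_probability(route):
    """P = 1 / (1 + exp(-z)), where z = intercept + sum of coefficients on the route."""
    z = INTERCEPT + sum(COEF.get(equipment, 0.0) for equipment in route)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical lot routes through Exposure (p50), Develop (p60), and Strip (p90).
p_low = defect_probability(["f51", "f61", "f92"])  # most negative coefficients
p_ref = defect_probability(["f53", "f63", "f93"])  # reference equipment only
```

Because f51, f61, and f92 all carry negative coefficients, `p_low` comes out below `p_ref`, which is the sense in which equipment with a smaller estimate has a smaller defect probability.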

For the processes with a large influence on each fault factor, normal equipment was distinguished from abnormal equipment based on the analysis results presented in Section 4.3. Specifically, within the three processes with a large cumulative influence that were selected in Section 4.2, the equipment was further classified as normal or abnormal. Table 5 outlines the results (based on Tables 3 and 4) for the identification of the fault-suspected process and equipment for each fault factor. Further, the pieces of equipment that had a large influence were derived for each process. The processes that commonly influenced the fault factors were p30, p60, p70, and p90. The normal and abnormal equipment for each process are listed in Table 5.

As shown in this table, in p30, equipment f32 was normal for F1 and F4; hence, this equipment had no influence on F1 and F4. Similarly, in p70, equipment f72 had no influence on F4, F5, and F6. However, a few pieces of equipment showed conflicting results. For example, in p90, equipment f91 had no influence on F4 and F6, but it influenced F5. This situation occurred because the fault factors had mutually incompatible characteristics. For example, when the connection thickness of the PBGA increased, the open circuit defects decreased, but the short circuit defects increased.

To confirm the processes identified as having the largest influence on the fault factors, the data for the process factors (line width, plating thickness, etc.) managed at actual manufacturing sites were analyzed. The results confirmed that these processes indeed influenced the corresponding fault factors. Moreover, for the plating process, it was verified theoretically that the process factor values change with the equipment condition and thereby influence the fault factors. Because it is very difficult to distinguish normal equipment from abnormal equipment through direct experiments, and because process factors change and influence fault factors depending on the equipment condition and status, additional verification through an analysis of the parameters of each piece of equipment is needed.

Table 5.

Fault factors | Equipment | Suspected process and equipment | |||||
---|---|---|---|---|---|---|---|
 | | p20 | p30 | p50 | p60 | p70 | p90 |
F1 | Normal | f22 | f32 | | | f75 | |
 | Abnormal | f21 | f33 | | | f71 | |
F2 | Normal | | | f51 | f61 | | f92 |
 | Abnormal | | | f53 | f63 | | f93 |
F3 | Normal | | f35 | | f62 | | |
 | Abnormal | | f31 | | f63 | | |
F4 | Normal | | f32 | | | f72 | f91 |
 | Abnormal | | f34 | | | f75 | f93 |
F5 | Normal | | | | f63 | f72 | f93 |
 | Abnormal | | | | f61 | f73 | f91 |
F6 | Normal | | | | f63 | f72 | f91 |
 | Abnormal | | | | f61 | f73 | f92 |
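The statement that p30, p60, p70, and p90 are common to several fault factors can be checked by counting how often each process appears in Table 5. In this sketch, the process of each entry is inferred from the equipment ID (f2x belongs to p20, f3x to p30, and so on), which is an assumption about the naming scheme; the variable names are illustrative.

```python
from collections import Counter

# Suspected processes per fault factor, inferred from the equipment IDs in Table 5.
suspected = {
    "F1": ["p20", "p30", "p70"],
    "F2": ["p50", "p60", "p90"],
    "F3": ["p30", "p60"],
    "F4": ["p30", "p70", "p90"],
    "F5": ["p60", "p70", "p90"],
    "F6": ["p60", "p70", "p90"],
}

# Count in how many fault factors each process appears.
counts = Counter(p for processes in suspected.values() for p in processes)

# Processes suspected for three or more fault factors.
common = sorted(p for p, n in counts.items() if n >= 3)
```

The resulting `common` list contains the four processes named in the text, while p20 and p50 each appear for a single fault factor.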

This study was conducted to ensure manufacturing competitiveness by improving the yields and productivity of the PBGA manufacturing process. To achieve these improvements, the processes and equipment that affect the yield of the current widely distributed PBGA products were analyzed. To this end, the data on the defects and equipment variables in the PBGA manufacturing process were analyzed to identify the processes that affect the yield. This paper also proposes a technique to analyze the processes to determine the equipment that has the most adverse effects. The processes and equipment that are classified as critical factors need to be managed intensively with input from field engineers. In addition, further advanced studies on factor selection may be conducted, focusing on the analyses of equipment variables.

If additional key parameters (linear values) of the equipment are analyzed together, the use of mutual information feature selection may be considered because nonlinear factors are also present.

Research on the methods of collecting equipment data in the PCB production process and utilizing them for productivity improvement has been limited. Thus, new data analysis techniques that consider the actual process environment must be developed.

He received his Ph.D. degree in Information & Industrial Engineering from Yonsei University in Seoul, Korea. He is now a professor in the Department of Industrial & Management Engineering at Kyonggi University, Korea. He is also a member of the editorial committee of the Journal of the Semiconductor & Display Technology. Dr. Sim worked as a group leader for Samsung Electronics Co.

- [1] D. H. Lee, J. K. Yang, C. H. Lee, K. J. Kim, "A data-driven approach to selection of critical process steps in the semiconductor manufacturing process considering missing and imbalanced data," *Journal of Manufacturing Systems*, vol. 52(Part A), pp. 146-156, 2019.
- [2] G. Gasso, 2019 (Online). Available: https://moodle.insa-rouen.fr/pluginfile.php/7984/mod_resource/content/6/Parties_1_et_3_DM/RegLog_Eng.pdf
- [3] G. A. Cherry, S. J. Qin, "Multiblock principal component analysis based on a combined index for semiconductor fault detection and diagnosis," *IEEE Transactions on Semiconductor Manufacturing*, vol. 19, no. 2, pp. 159-172, 2006.
- [4] L. Yan, "A PCA-based PCM data analyzing method for diagnosing process failures," *IEEE Transactions on Semiconductor Manufacturing*, vol. 19, no. 4, pp. 404-410, 2006.
- [5] B. E. Goodlin, D. S. Boning, H. H. Sawin, B. M. Wise, "Simultaneous fault detection and classification for semiconductor manufacturing tools," *Journal of the Electrochemical Society*, vol. 150, no. 12, pp. 778-784, 2003.
- [6] K. B. Lee, S. Cheon, C. O. Kim, "A convolutional neural network for fault classification and diagnosis in semiconductor manufacturing processes," *IEEE Transactions on Semiconductor Manufacturing*, vol. 30, no. 2, pp. 135-142, 2017.
- [7] J. Arkes, *Regression Analysis: A Practical Introduction*, Oxon, UK: Routledge, 2019.
- [8] M. Pal, P. Bharati, "Introduction to correlation and linear regression analysis," in *Applications of Regression Techniques*, Singapore: Springer, pp. 1-18, 2019.
- [9] C. H. Cheon, *Data Mining Techniques*, Seoul, Korea: Hannarae Publishing, 2015.
- [10] S. Lee, "Logistic regression procedure using penalized maximum likelihood estimation for differential item functioning," *Journal of Educational Measurement*, vol. 57, no. 3, pp. 443-457, 2020.