Neural Network for Detection of IVF/ICSI Outcomes with Surgically Retrieved Sperm

Portions of this document were adapted from our paper:

Computational models for prediction of IVF/ICSI outcomes with surgically retrieved spermatozoa. M Wald, AET Sparks, J Sandlow, B Van Voorhis, CH Syrop, and CS Niederberger, Reproductive BioMedicine Online, September 2005.

The reader is strongly encouraged to read this paper prior to using the network, as it contains a more specific description of the model.


About IVF/ICSI outcomes with surgically retrieved spermatozoa

The introduction of in-vitro fertilization (IVF) and intra-cytoplasmic sperm injection (ICSI), along with the development of advanced procedures for surgical sperm retrieval, has revolutionized the treatment of severe male factor infertility. However, the effects of the anatomical site of sperm retrieval, the etiology of azoospermia, and sperm cryopreservation on IVF/ICSI outcomes are still controversial. While similar IVF/ICSI pregnancy outcomes have been reported using epididymal or testicular sperm in azoospermic patients with similar etiology, the data comparing ICSI outcomes between patients with obstructive and non-obstructive azoospermia are less consistent. Although some authors suggest no difference, the majority of reports show significantly impaired fertilization or pregnancy outcomes in IVF/ICSI cycles using surgically retrieved testicular sperm from men with non-obstructive azoospermia (NOA) compared with testicular sperm from men with obstructive azoospermia (OA).

While significantly lower fertilization rates have been reported for cryopreserved testicular sperm, the majority of reports suggest no significant worsening in outcome with the use of cryopreserved surgically retrieved sperm. However, most of these studies investigated a combination of men with obstructive and non-obstructive azoospermia, without comparing the effect of cryopreservation by etiology of infertility. Comparison of fresh and frozen-thawed testicular sperm in terms of IVF/ICSI outcome when sperm was obtained from OA patients only has been limited by the small numbers investigated. However, a similar comparative analysis of fresh and cryopreserved testicular sperm from only men with NOA showed no significant differences in either fertilization or pregnancy outcome.

Given the currently available data and the controversies concerning the use of surgically retrieved sperm for IVF/ICSI, a tool that predicts the outcomes of IVF/ICSI with surgically retrieved sperm in various clinical settings, involving variable combinations of related factors, may be of clinical merit. A neural network may accomplish this task. Further statistical analysis of the neural network may determine the significance of the evaluated clinical factors to the model’s outcome.

Neural Network Programming and Training

Clinical data collected from 85 women who had undergone 113 IVF/ICSI cycles, primarily for male factor infertility, were used to construct the dataset for the study. This dataset was modeled using "neUROn2++", a suite of C++ programs we designed to implement neural computational and statistical algorithms, which are cross-compiled using Microsoft Visual C++ version 6 and GNU C++ (Cygwin port version 2.95).

Maternal age, type of sperm retrieval technique (testicular sperm extraction, testicular sperm aspiration and microscopic epididymal sperm aspiration), type of sperm used (cryopreserved or “fresh”), and the type of male factor infertility (previous vasectomy, with and without reversal; congenital bilateral absence of the vas deferens; other obstructive conditions; non-obstructive azoospermia; varicocele; and other conditions) were encoded as the input nodes to the neural network. The output node represented intra-uterine pregnancies achieved through IVF/ICSI. Sperm retrieval, type of male factor and intra-uterine pregnancy parameters were assigned either 1, if present for each exemplar, or 0, if not. For sperm type, cryopreserved sperm was assigned 1, while “fresh” sperm was assigned 0.
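The encoding described above can be illustrated with a short Python sketch. It is not taken from neUROn2++: the category labels, the function name encode_exemplar, and the choice to pass maternal age through unscaled are all assumptions made for illustration, as is treating previous vasectomy (with or without reversal) as a single category.

```python
# Hypothetical labels: the text names these categories but not these identifiers.
RETRIEVAL = ("TESE", "TESA", "MESA")
MALE_FACTOR = (
    "vasectomy",                    # assumption: with and without reversal share one node
    "cbavd",                        # congenital bilateral absence of the vas deferens
    "other_obstructive",
    "non_obstructive_azoospermia",
    "varicocele",
    "other",
)


def encode_exemplar(maternal_age, retrieval, cryopreserved, male_factor):
    """Return the input vector for one IVF/ICSI cycle (one exemplar)."""
    if retrieval not in RETRIEVAL:
        raise ValueError(f"unknown retrieval technique: {retrieval!r}")
    if male_factor not in MALE_FACTOR:
        raise ValueError(f"unknown male factor: {male_factor!r}")
    vector = [float(maternal_age)]  # assumption: age is passed through unscaled
    # One node per retrieval technique: 1 if present for this exemplar, else 0.
    vector += [1.0 if retrieval == name else 0.0 for name in RETRIEVAL]
    # Sperm type: cryopreserved = 1, fresh = 0.
    vector.append(1.0 if cryopreserved else 0.0)
    # One node per type of male factor infertility.
    vector += [1.0 if male_factor == name else 0.0 for name in MALE_FACTOR]
    return vector


example = encode_exemplar(34, "TESE", True, "non_obstructive_azoospermia")
print(example)
```

Under these assumptions each exemplar becomes an 11-element vector: one age value, three retrieval nodes, one sperm type node, and six male factor nodes.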

The dataset was randomized into a modeling (“training”) set of 83 exemplars and a separate, completely independent cross-validation (“test”) set of 30 exemplars. The test set was excluded from training and used only for cross-validation (n1/n2 method). A single hidden layer with 4 nodes was determined to represent an optimal network architecture, which maintained acceptable goodness-of-fit without overlearning. The training method was canonical off-line backpropagation with weight decay, with the weight decay term lambda chosen to be 5e-05 [1]. The network was considered to be trained to completion when the error was observed to be oscillating at a local error minimum.
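The following Python sketch shows one way such a setup could look: an 83/30 split, a single hidden layer of 4 nodes, and off-line (batch) backpropagation with a weight decay term of 5e-05. It is not the neUROn2++ implementation. The sigmoid activations, cross-entropy error, learning rate, fixed epoch count, and synthetic three-input data are all assumptions, and the decay penalty is applied to biases as well as weights for simplicity.

```python
import math
import random

LAMBDA = 5e-05  # weight decay term reported in the text
HIDDEN = 4      # hidden nodes reported in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, rng):
    """Small random weights; the last entry of each weight row is the bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)] for _ in range(HIDDEN)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN + 1)]
    return hidden, output


def forward(net, x):
    hidden, output = net
    # zip stops at the shorter sequence, so the trailing bias is added separately.
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h)) + output[-1])
    return h, y


def loss(net, data):
    """Mean cross-entropy plus the weight decay penalty."""
    hidden, output = net
    ce = 0.0
    for x, t in data:
        _, y = forward(net, x)
        ce -= t * math.log(y + 1e-12) + (1 - t) * math.log(1 - y + 1e-12)
    penalty = sum(w * w for row in hidden for w in row) + sum(w * w for w in output)
    return ce / len(data) + LAMBDA * penalty


def train_epoch(net, data, rate):
    """One off-line update: accumulate gradients over all exemplars, then step once."""
    hidden, output = net
    g_hidden = [[0.0] * len(row) for row in hidden]
    g_output = [0.0] * len(output)
    for x, t in data:
        h, y = forward(net, x)
        delta_out = y - t  # cross-entropy gradient at the output pre-activation
        for j in range(HIDDEN):
            g_output[j] += delta_out * h[j]
            delta_h = delta_out * output[j] * h[j] * (1.0 - h[j])
            for i, v in enumerate(x):
                g_hidden[j][i] += delta_h * v
            g_hidden[j][-1] += delta_h
        g_output[-1] += delta_out
    n = len(data)
    for j in range(HIDDEN):
        for i in range(len(hidden[j])):
            hidden[j][i] -= rate * (g_hidden[j][i] / n + 2 * LAMBDA * hidden[j][i])
    for j in range(len(output)):
        output[j] -= rate * (g_output[j] / n + 2 * LAMBDA * output[j])


rng = random.Random(0)
# Synthetic stand-in data: 113 exemplars with 3 inputs, NOT the clinical dataset.
exemplars = []
for _ in range(113):
    x = [rng.random() for _ in range(3)]
    exemplars.append((x, 1.0 if x[0] + x[1] > 1.0 else 0.0))
rng.shuffle(exemplars)
train_set, test_set = exemplars[:83], exemplars[83:]  # 83 training, 30 test

net = init_network(3, rng)
loss_before = loss(net, train_set)
for _ in range(2000):  # assumption: fixed epoch count instead of watching for oscillation
    train_epoch(net, train_set, rate=0.5)
loss_after = loss(net, train_set)
print(f"training loss {loss_before:.3f} -> {loss_after:.3f}; test loss {loss(net, test_set):.3f}")
```

The test set is touched only in the final line, mirroring the text's requirement that it be excluded from training. A stopping rule closer to the one described would monitor the training error and halt once it oscillates around a minimum.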

Wilks' Generalized Likelihood Ratio Test (GLRT) was used to determine which input features were significant to the model's outcome in a reverse regression analysis [1]. Maternal age was found to be the most significant predictor of pregnancy outcomes (p=0.025), followed by sperm type (p=0.076). Type of male factor (p=0.47) and sperm retrieval technique (p=0.88) were not found to be significant predictors. The dataset was also modeled using logistic regression and linear and quadratic discriminant function analysis (LDFA and QDFA). Model accuracies, in terms of receiver operating characteristic (ROC) curve area, were higher with the nonlinear method of neural computation than those achieved by the traditional linear statistical modeling tools.
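As a rough illustration of how a likelihood ratio test assigns a p-value to an input feature, the sketch below compares the log-likelihood of a full model with that of a reduced model lacking the feature. It shows the textbook form of the test, not the exact procedure in neUROn2++. The function names, the log-likelihood values, and the choice of 4 degrees of freedom (one input node feeding 4 hidden nodes) are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(x/2)), and add correction terms.
    tail = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for r in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * r + 1)
    return tail


def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return (statistic, p-value) for removing df free parameters from the full model."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods, for illustration only (not values from the study).
stat, p = likelihood_ratio_test(loglik_full=-50.0, loglik_reduced=-54.0, df=4)
print(f"statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value suggests that removing the feature degrades the fit more than chance alone would explain, which is the sense in which a feature is called significant to the model's outcome.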

Accuracy of the Neural Network Compared to Linear Methods

Method    Training Set ROC Area [1]    Test Set ROC Area [2]
Neural Network
Logistic Regression
Quadratic Discriminant Function Analysis
Linear Discriminant Function Analysis

  1. A description of this method of training, including weight decay and feature extraction using Wilks' GLRT, may be found in Golden RM, Mathematical methods for neural network analysis and design, Cambridge, MA: MIT Press, 1996.

  2. Receiver Operating Characteristic Curve area. As the area approaches 1.0, the accuracy of the statistical method improves: an ROC area of 1.0 would indicate a sensitivity of 1.0 and a specificity of 1.0. ROC areas were computed using the statistical method described by Wickens: Wickens TD, Elementary signal detection theory, New York: Oxford University Press, 2002.
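For readers unfamiliar with ROC area, the sketch below computes it by pairwise comparison (the Mann-Whitney formulation): the fraction of positive/negative pairs in which the positive exemplar receives the higher score, counting ties as half. This is a common estimator and is not necessarily the computation described by Wickens. The function name and the scores and labels are hypothetical.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative exemplar")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical network outputs and pregnancy outcomes, for illustration only.
scores = [0.9, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
print(roc_area(scores, labels))
```

An area of 0.5 corresponds to chance-level ranking, and 1.0 to a model that scores every positive exemplar above every negative one.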

IVF.prc, a PalmOS application for the PalmPilot or Handspring Visor device, is available for download.

