The Kepler Conference
A Research Revolution 
Jan. 25 - 28, 2018
Evidence-Based Astrology for the 21st Century
On Florida's Beautiful Space Coast

Timing & Human Performance
 Advanced Applications for Business 




Kyösti Tarvainen
PhD, Docent in Systems Analysis
Oct. 30, 2015

Dr. Kyösti Tarvainen has obtained some highly significant results in recent years, many of which have been peer-reviewed and published in Correlation. Dr. Tarvainen wrote the following essay especially for the Kepler Conference participants. We strongly encourage researchers to learn from his methods and to try to replicate and expand upon these important results.

In this article, I will report on my experiences using computer-generated control groups to refine and substantiate research results in:  1) Natal astrology,  2) Synastry  &  3) Prediction methods.

  1. Control groups for natal charts

In statistical studies, it is important to use control groups to check whether a result has arisen merely by chance or as an artifact of the irregular movements of the planets, of uneven seasonal and daily variations in births, or even of a specific yearly distribution of births in the collected data. For example, the French astrologer Leon Lasson observed that Mercury was often near the Ascendant in writers’ charts. However, his successor, Michel Gauquelin, showed that Mercury is often close to the Ascendant in any sample of birth charts, because many births occur near sunrise and Mercury is always close to the Sun.

When preparing my research on the statements in the Sakoian and Acker handbook [6], I compared various control-group methods, as presented by O’Neill (1995), Ertel (1995), Ruis (2007) and in AstroDatabank (2005). Computer simulations showed that shuffling (also called switching, randomization or permutation) is an appropriate method (cf. [6]), and I have used it in all my statistical studies dealing with natal charts. Other methods could have been used, and some might even be better, but shuffling still seems to be the method most generally applicable to this type of research. That is why it is already programmed into AstroDatabank (2005) (the research part of AstroDatabank may not currently be available due to a change of ownership). The JigSaw program also includes shuffling.

In shuffling, a control group is formed using the same data items as the real data set, but the date (day and month together), year, hour and place (longitude and latitude together) are taken in random order. The distributions of the data items are therefore reproduced, for example the daily birth rate distribution that was a critical issue in Lasson’s study. It is usually appropriate to use local mean times, as suggested by O’Neill (1995). To reduce random variations, several shuffled data groups can be combined into the total control group (with present-day computers, the shuffling can be repeated 500-1000 times for data sets with fewer than 1000 charts).
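The shuffling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the studies or in AstroDatabank: the `Birth` record and the function names are hypothetical, each of the four items is permuted independently, and the sketch ignores the fact that a shuffled date such as 29 February may land in a non-leap year.

```python
import random
from typing import List, NamedTuple, Tuple


class Birth(NamedTuple):
    day_month: Tuple[int, int]   # (day, month), kept together
    year: int
    hour: float                  # local mean time, in decimal hours
    place: Tuple[float, float]   # (longitude, latitude), kept together


def shuffled_control(births: List[Birth], rng: random.Random) -> List[Birth]:
    """Build one control group by permuting each item independently."""
    dates = [b.day_month for b in births]
    years = [b.year for b in births]
    hours = [b.hour for b in births]
    places = [b.place for b in births]
    for column in (dates, years, hours, places):
        rng.shuffle(column)  # in-place permutation; the set of values is unchanged
    return [Birth(d, y, h, p) for d, y, h, p in zip(dates, years, hours, places)]


def total_control(births: List[Birth], replications: int, rng: random.Random) -> List[Birth]:
    """Pool several shuffled groups into one large control group."""
    pooled: List[Birth] = []
    for _ in range(replications):
        pooled.extend(shuffled_control(births, rng))
    return pooled
```

Because every replication reuses exactly the same years, hours, dates and places, the pooled control group has the same yearly, hourly and seasonal distributions as the data.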

Shuffling is a conservative method: if it shows a difference between the data and the control group, the real difference is probably bigger. To see this, consider the hypothetical situation in which all subjects in a given data set were born at noon. Then all the control charts would also be ‘born’ at noon, and this special feature of the data would go undetected. (This is an extreme example; in reality the astrological effects are relatively small, and the data generally appear random.)

Another potential problem is that in a control group made by shuffling, the control birth dates (day and month) can be too similar to the ones in the data set. In other words, the Sun positions are about the same, so we don’t fully see how the people in the data set deviate from the random or expected frequencies. For that reason, it is sometimes useful to perturb the birth dates randomly, for example by 90 days at most, so as not to disrupt the seasonal birth rhythm. AstroDatabank (2005) has a corresponding option for this very reason.

Figure 1. Birth times of 9 writers born in 1909 (big dots) and 9,000 control births in 1909 (small dots) generated by shuffling.

Figure 1 illustrates how the shuffling method generates a large, representative set of control births (small dots) for the real writers (big dots).

Nine of the writers in the data were born in 1909, and the big dots show their birth times. The birth times are plotted in two dimensions: the horizontal axis gives the day of the year (numbered from 1 to 365) and the vertical axis gives the birth hour. For example, the second big dot from the left represents an author who was born on March 31, 1909 (the 90th day of that year), at 9:50 p.m.

The data set under consideration consists of 1352 writers, and the shuffling has been done 1000 times. The number 1000 was chosen because repeating the whole set of 1000 shufflings several times doesn’t significantly change the derived results. As the shuffling reproduces the yearly distribution, the total control group includes 9,000 persons born in 1909 (9 writers in each of the 1000 shufflings). These 9,000 control births are shown by small dots in Figure 1.

In Figure 1, you can discern that in 1909 there are more control births in the mornings than in the evenings, reflecting the hourly birth rate of all 1352 writers. One may also just discern that there are more control births during the first half of the year than later, which reflects the seasonal distribution of all writers.

  2. Control groups for synastry

    Three different methods of forming control couples in synastry are compared in [12]. Figure 2 illustrates the method which turned out to work best. The real couples are compared to randomly generated control couples. In Figure 2, the wife of a data family is born in 1921 and her husband in 1925. A control wife related to the data wife is generated in her birth year or adjacent years. The birth year is selected randomly among these three years with the same probability (the birth rate does not generally vary much from year to year). The birth date (day and month) is drawn among the dates of all wives in the data (the seasonal distribution is reproduced). Another wife is randomly selected among all wives, and her birth hour is taken (the hourly birth distribution is reproduced).

    Similarly, a control husband is generated within the three years around her husband’s birth year. These two births form a control family for this data family. Points for ten control families are depicted in Figure 2. In computer runs, 100 control families were usually generated for each data family in [7], where the data consists of 20,895 couples.
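The generation of control families described above might be sketched as follows. This is an illustration under assumptions, not the program used in [7] or [12]: the function names and the tuple layout are hypothetical, the control husband's date and hour are assumed to be drawn from the husbands in the data, the two draws from the pool are independent (so they may occasionally pick the same person), and birth places are left out.

```python
import random


def control_spouse(birth_year, pool, rng):
    """Generate one control birth for a spouse born in birth_year.

    pool holds (year, (day, month), hour) tuples for all wives
    (or all husbands) in the data.
    """
    year = birth_year + rng.choice((-1, 0, 1))  # three years, equal probability
    day_month = rng.choice(pool)[1]             # reproduces the seasonal distribution
    hour = rng.choice(pool)[2]                  # separate draw: hourly distribution
    return (year, day_month, hour)


def control_families(wife, husband, wives, husbands, n, rng):
    """Return n control couples for one data couple."""
    return [
        (control_spouse(wife[0], wives, rng),
         control_spouse(husband[0], husbands, rng))
        for _ in range(n)
    ]
```

Calling `control_families(wife, husband, wives, husbands, 100, rng)` for every data couple would give 100 control families per couple, the number usually used in [7].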

    Figure 2. Generating control families for a data couple where the wife is born in 1921 and her husband in 1925.

    As a kind of extension to the Sun sign synastry, I studied in [7] whether there is an excess of conjunctions and trines of “beneficial” and “neutral” planets and points (SO, MO, ME, VE, JU, AS and MC) across the charts of 20,895 couples. This turned out to be true in a statistically significant way.

    Another major theme in classical synastry is the placement of one’s planets in the partner’s houses. Sakoian and Acker (1976) state that the Sun’s placement in the partner’s 1st, 5th or 7th house is favorable for romantic relationships. As we see in Figure 3, the Sun occurs in excess in all three of these houses when the Koch, Placidus or Equal House system is used.

    Figure 3. The excess of one’s Sun in the spouse’s houses when using four different house systems (Koch, Placidus, Equal, Whole Sign). The question under consideration is how often one partner's Sun is in a house of the other partner’s chart (and vice versa). The excesses are obtained by comparing the data families to control families obtained by the method illustrated in Figure 2.

    In Figure 3, one sees that the Whole Sign House system doesn’t work as well as the other systems. In fact, we can, as explained in [7], use the p-value as a measure of the goodness of a house system. The house systems under consideration obtained the following ranking: Koch, Equal, Campanus, Regiomontanus, Alcabitius, Porphyry, Placidus, Solar, Whole Sign, Morinus, Meridian, Midheaven, Horizon. Naturally this is only one study with a limited point of view, but it shows that statistical methods can also be used to study the technical issues of astrology.

    Besides these classical synastry methods, composite charts and Davison charts were also studied in [7]. In both methods, a new chart is determined from the birth data of the two partners. The Sun in this common chart would also be expected to fall often in the 1st, 5th or 7th house, but this was clearly not the case. This casts doubt on the workings of these two modern synastry methods, especially since the data set was so large that classical synastry obtained very clear support.

  3. Control groups for prediction methods

    The most useful way of generating control charts for prediction methods (for example, transits, solar arcs, progressions, solar returns) is very simple. One first counts, chart by chart, how many astrological hits there are for the events considered (for example, for a professional advancement, transiting Jupiter at the MC or another beneficial astrological situation). Then one considers alternative birth hours for each chart and records the number of hits in these control charts. If the total number of hits with the real birth times is greater than that with the alternative birth hours, this is a good indication that the prediction method works. Such a method could also be used in rectification.
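The comparison described above can be outlined in code. The sketch below is only a skeleton under assumptions: `count_hits` is a hypothetical function, supplied by the researcher, that does the actual astrological work of counting hits for one chart, and the alternative birth hours are formed here by shifting the real birth hour by fixed offsets, which is one possible choice among several.

```python
def compare_hit_totals(charts, count_hits, hour_offsets):
    """Compare hits for real birth hours with hits for shifted birth hours.

    charts: list of (birth_hour, events) pairs, one per person.
    count_hits: hypothetical callable (birth_hour, events) -> number of hits.
    hour_offsets: shifts in hours; each offset defines one set of control charts.
    """
    real_total = sum(count_hits(hour, events) for hour, events in charts)
    control_totals = [
        sum(count_hits((hour + offset) % 24, events) for hour, events in charts)
        for offset in hour_offsets
    ]
    return real_total, control_totals
```

If `real_total` clearly exceeds the values in `control_totals`, the real birth times fit the events better than the alternative ones. For rectification, the same totals could be computed for a single chart and the best-scoring hour inspected.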

    This procedure was used by Bert Terpstra for (day/year) progressions. He considered the life of astrologer and writer Graham Greene, for whom many events were recorded. The research results are reviewed on the webpage Research Results (91 Abstracts) (Birth Time Reconstruction). Terpstra did not find support for the working of progressions.

    If the birth chart describes the person himself, then I suspect that one reason for this failure could be that many of the events taken into account did not depend on the actions, feelings or initiatives of Graham Greene himself, but on outer circumstances and other persons. Furthermore, there may be a delay between a personal decision and its realization. For example, when I decided to apply for studies abroad, transiting Uranus was on my Ascendant, but when the visible move happened one year later, Uranus was no longer conjunct my Ascendant.

    In [9], a more complicated prediction case was taken under consideration.  The question was whether there are beneficial transits, solar arcs or progressions before the conception which positively influenced parents’ willingness to have a baby. Since the time from starting to have this willingness to the realization of the conception varies, it was necessary to consider the astrological factors during the months before the conception.

    Figure 4 presents the number of soft transit aspects before and after the estimated conception (which was assumed to happen nine months before the child’s birth). This case is also complicated by natural fluctuations in the total number of these aspects. The general trend of this total number is shown by the smooth curve in Figure 4. Above it, during about nine months before the estimated conceptions, we see an excess of soft transits which may have contributed to positive feelings towards having a child. (There is also an excess about 25 months before the conceptions: perhaps other pleasant things, such as getting married, happened then.)

    Figure 4. The sum of soft transit aspects in 69,969 parents’ charts before and after the estimated conception (the zero point on the x-axis). Conjunctions, sextiles and trines (one degree orb) of transiting Jupiter, Saturn, Uranus, Neptune and Pluto to the parents’ ten planets, Ascendant and MC are taken into account. The six big dots mark this sum one to six months before conception. The smooth curve is a fifth-order polynomial fitted to the sum curve by the least-squares method.
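A trend curve of the kind shown in Figure 4 can be obtained with a standard least-squares polynomial fit. The sketch below uses NumPy's `polyfit` and `polyval`; the function name `excess_over_trend` and its arguments are hypothetical, and it is not known which software was used for the fit in [9].

```python
import numpy as np


def excess_over_trend(months, sums, degree=5):
    """Fit a polynomial trend by least squares and return (trend, excess)."""
    months = np.asarray(months, dtype=float)
    sums = np.asarray(sums, dtype=float)
    coeffs = np.polyfit(months, sums, degree)  # least-squares polynomial fit
    trend = np.polyval(coeffs, months)         # smooth curve at the same x values
    return trend, sums - trend                 # positive excess lies above the curve
```

Here `months` would be the months relative to the estimated conception and `sums` the monthly totals of soft transit aspects; the returned excess is the vertical distance of each point from the smooth curve.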

    In this case, the control groups were formed by generating mothers and children (details in [9]), and then comparing how often the excess of transits before the conceptions was as big as in Figure 4. This probability (p-value) was very small (0.005) giving support for the workings of transits. Solar arcs (Naibod) seemed to work too, but not (year/day) progressions.
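The p-value in this kind of comparison is simply the share of control groups that show an excess at least as big as the real data. A minimal sketch follows; the function name is hypothetical, and the details of how the control excesses were computed are in [9], not here.

```python
def empirical_p_value(observed_excess, control_excesses):
    """Share of control groups whose excess is at least as big as the observed one."""
    if not control_excesses:
        raise ValueError("at least one control group is needed")
    as_big = sum(1 for value in control_excesses if value >= observed_excess)
    return as_big / len(control_excesses)
```

With this definition, a p-value of 0.005 means that only 5 in every 1000 control groups reached the excess seen in the real data.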

    It was also possible to estimate the orbs of transits and solar arcs by varying the maximum orb used and observing when the excess before the conception is biggest. With orbs that are too small, there are few active aspects; with orbs that are too large, there are too many active aspects and the relative excess flattens out. The maximum excess occurred when the maximum orb used was 1° for transits and solar arcs, but about 2° for the transiting Sun and Moon. These estimates thus confirmed the orbs usually recommended by astrologers. In studies [8], [10] and [11], confirmation was obtained for major-aspect orbs of 6°-9°, which are the orbs usually recommended for natal charts. Recommendations for inter-chart aspects in synastry vary widely in the astrological literature. In [7] and [12], rather big orbs of 9°-12° were estimated for synastry aspects.