A total of 200 cows were enrolled over a 1.5-month period into a mixed-parity herd

As expected, the dendrograms pruned with the ensemble of simulations that accounted for both measurement error and longitudinal consistency of the underlying behavioral pattern produced encodings that were far more granular. A total of 13 clusters were returned for the unweighted Euclidean metric, 17 for KL distance, and 14 for both the noise-penalized and the plasticity-penalized dissimilarity metrics. In Figure 5B, the heatmap visualization of these pruning results for the plasticity-penalized dissimilarity metric reveals an encoding that is coarser but ultimately quite well balanced, with the pruning heights modulated to produce cluster sizes that were reasonably uniform across the domain of support. Closer inspection revealed that this final encoding largely matched the order of bifurcations in the original tree, except that this pruning strategy left no animals isolated in anomalous clusters. It should be noted, however, that the granularity of this encoding is not entirely intrinsic to this system, but was dependent on the size of the subsample used to calculate the overall time budget in each simulation. While we can expect cows that were more inconsistent in their daily time budgets to be subjected to a stronger penalty with this estimator due to relatively higher rates of sampling error imposed by the subsampling routine, we can also anticipate that the overall scale of the sampling error imposed on all cows should grow as the size of the subsample is reduced. This would in turn modulate how quickly the underlying behavioral signals would be drowned out by simulated noise within the tree. This suggests that, for larger samples where a greater range of subsample sizes can be utilized, this simulation value can also be treated as a meta-parameter to tune the granularity of the final encoding.
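
The dependence of sampling error on subsample size can be illustrated with a minimal sketch. It is not the simulation code used in this study: the helper names (`make_days`, `subsampled_budget`, `sampling_error`), the three-behavior budget, and the gamma-based generator for synthetic cows are all assumptions made for illustration. Only the subsampling of observation days without replacement is taken from the text.

```python
import random
import statistics

def make_days(concentration, rng, n_days=65, base=(0.3, 0.4, 0.3)):
    """Hypothetical generator: daily budgets scattered around a base budget.
    Higher concentration means a more consistent cow (assumption, not study data)."""
    days = []
    for _ in range(n_days):
        draws = [rng.gammavariate(concentration * p, 1.0) for p in base]
        total = sum(draws)
        days.append([d / total for d in draws])  # proportions sum to 1
    return days

def subsampled_budget(daily_budgets, k, rng):
    """Overall time budget estimated from k days drawn without replacement."""
    days = rng.sample(daily_budgets, k)
    return [sum(d[b] for d in days) / k for b in range(len(days[0]))]

def sampling_error(daily_budgets, k, n_sims=500, seed=0):
    """Mean Euclidean distance between subsampled and full-record budgets."""
    rng = random.Random(seed)
    n_days = len(daily_budgets)
    n_beh = len(daily_budgets[0])
    full = [sum(d[b] for d in daily_budgets) / n_days for b in range(n_beh)]
    dists = []
    for _ in range(n_sims):
        sim = subsampled_budget(daily_budgets, k, rng)
        dists.append(sum((s - f) ** 2 for s, f in zip(sim, full)) ** 0.5)
    return statistics.mean(dists)

rng = random.Random(1)
steady = make_days(200, rng)    # consistent cow
variable = make_days(20, rng)   # inconsistent cow
for k in (7, 14, 28):
    print(k, round(sampling_error(steady, k), 4), round(sampling_error(variable, k), 4))
```

In this toy setting the error shrinks as `k` grows and is larger for the less consistent cow at any fixed `k`, which is the behavior that would allow the subsample size to act as a tuning parameter.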

Given that the plasticity-penalized mimicry was created for this data set by subsampling only 14 out of 65 observation days, the resolution achieved in the pruned encodings for all four dissimilarity metrics reinforces that this herd was overall fairly consistent in its daily time budgets, and that this data set will support fairly detailed inferences against a strong underlying behavioral pattern.

Encodings of the overall time budgets produced using both the noise-penalized and plasticity-penalized dissimilarity estimators, wherein both were pruned using the more conservative plasticity-penalized ensemble, produced similar behavioral insights when compared against longitudinal patterns in parlor entry position across the herd. For the bivariate analyses run with encodings for all 177 cows with complete records, highly significant associations with entry order were recovered for both the noise-penalized and plasticity-penalized time budget encodings. The bivariate relationship was optimized for both time budget encodings with a five-cluster encoding of entry order patterns. The noise-penalized encoding produced the strongest associations with entry order with seven time budget clusters, whereas the plasticity-penalized encoding performed better with a finer encoding of nine clusters, the key difference being the degree of stratification among animals with the most moderate time budgets. Visualization of the contingency tables for the optimized encodings colored by their PMI estimates revealed that the significant overall association between the two data streams was driven predominantly by animals in the latter half of the milking queue. Figure 5 displays the results for the noise-penalized encoding. We see first that cows that entered consistently at the very rear of the queue were significantly over-represented in the time budget cluster characterized by moderate time spent eating, low time nonactive, and high rates of rumination.
Cows that entered nearer the back of the queue, just ahead of the cows that consistently brought up the rear, were also over-represented in the same time budget cluster, a trend that was statistically significant for the plasticity-penalized encoding but only marginally significant for the noise-penalized encoding.
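
For readers unfamiliar with PMI, the cell-wise quantity used to color such a contingency table can be sketched as follows. This uses the textbook plug-in definition, log2 of p(x, y) / (p(x) p(y)). The text does not state the exact estimator, logarithm base, or significance test used here, so the function name `pmi_table`, the base-2 logarithm, and the toy counts are assumptions.

```python
import math

def pmi_table(counts):
    """Plug-in pointwise mutual information for each cell of a contingency table.

    counts[i][j] is the number of animals in row cluster i and column cluster j.
    Empty cells are returned as -inf (the joint probability estimate is zero).
    """
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    table = []
    for i, row in enumerate(counts):
        out = []
        for j, n in enumerate(row):
            if n == 0:
                out.append(float("-inf"))
            else:
                # p(x, y) / (p(x) p(y)) simplifies to n * total / (row_tot * col_tot)
                out.append(math.log2(n * total / (row_tot[i] * col_tot[j])))
        table.append(out)
    return table

# Toy table (hypothetical counts): rows = time budget clusters, columns = queue positions
toy = [[10, 0], [0, 10]]
print(pmi_table(toy))
```

Positive values mark cells where animals are over-represented relative to independence, negative values mark under-representation, and zero indicates no association in that cell.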

In fact, very few animals that entered in the front half of the queue were found to have this time budget pattern, with cows entering just behind the leaders being significantly under-represented in this time budget cluster. One potential interpretation of this pattern might be that, if these cows are prioritizing time investments in rumination, then this strategy may include hanging back towards the later part of the queue, where they may be able to chew their cud while avoiding the more serious contention for parlor entry position. Further analysis that could facilitate visualization of the cyclical patterns in this time budget data would be needed, however, to confirm this suspicion, and will be left for future work.

While this more moderate tradeoff between rumination and non-activity demonstrated a fairly straightforward and progressive trend across the milking queue, which might readily have been captured by a linear model, more complex dynamics were found for the time budget cluster characterized by extremely low time spent eating and high time spent ruminating and nonactive. Cows that consistently entered at the very end of the queue were significantly under-represented in this extreme time budget, while the cows that entered just ahead of them were significantly over-represented. While an extreme tradeoff in eating and ruminating might be explained by issues with sensor placement, that such cows are not evenly dispersed across the herd may instead indicate a biological driver. Health status naturally comes to mind with such an extreme time budget, and indeed several previous studies have reported higher rates of health complications amongst animals in the latter part of the milking queue, but health status alone would not necessarily explain the inversion in association pattern between these two adjacent queue groups.

Time budgets provide a convenient and intuitive means of quantitatively summarizing the behavioral tradeoffs of animals, but multinomial-distributed data present a number of analytical challenges. The results of this analytical case study have highlighted how a novel simulation-based approach may be employed to simultaneously accommodate both the codependency structures fundamental to multivariate-distributed data formats and the complex multi-faceted sources of measurement uncertainty that may be encountered across a broader range of PLF data streams. While such simulations may be more computationally expensive than closed-form estimators, we have demonstrated that an ensemble of data mimicries can be efficiently repurposed throughout the analytical pipeline to improve not only the visualization of these behavioral tradeoffs, but also the compression of such information into robust empirically defined discrete encodings. It should be noted, however, that the utility of these novel clustering techniques is not restricted to time budget data. The ensemble-penalized dissimilarity estimator and ensemble cut algorithm that we have introduced in this case study are both fundamentally non-parametric. This means that their implementation is in no way intrinsically restricted to any particular class of data. Consequently, the choices that a user makes in constructing an appropriate error simulation model are restricted only by their own creativity, allowing this analytical framework to be easily generalized to a much wider array of PLF data streams and the wider array of complex error structures that they have to offer. Additionally, while discrete data is typically seen as an impediment to statistical analysis in most model-based approaches, we hope that this analytical case study has served to demonstrate the comparable ease with which insights may be extracted from encoded data when an information-theoretic approach is employed.
For large, structurally complex, and often informationally redundant PLF data streams, an efficient encoding may be far easier to achieve than a comprehensive model that can fully accommodate the temporal dynamics of behavioral responses in dynamic farm environments. This may be especially true for data sets where not all of the factors driving such behavioral responses are measurable. While more formal model-based inferences may be warranted for further analysis of the underlying causes of this relationship, the exploratory analysis tools provided by the LIT pipeline have undoubtedly served to create a more comprehensive picture of the complex behavioral dynamics hiding within these two underutilized data streams.

Precision Livestock Farming technologies create new opportunities to record the behaviors of large groups of socially housed animals in complex working farm environments. The datasets produced by such technologies, however, often contain measurements collected more frequently and over longer observation windows than is possible with direct human observation, resulting in complex structural features that in turn present new logistical challenges in extracting usable knowledge from these rich data streams. In previous work, we introduced the concept of ensemble mimicries, wherein minimally parametric simulation techniques can be integrated into standard hierarchical clustering pipelines in order to account for complex nested sources of measurement error often encountered in PLF applications. In analyses of overall time budget data, this not only allowed us to account for heterogeneous variance in sensor noise, but also to penalize animals whose behavioral responses were more variable over the observation window. This subsequently produced encodings that gave relatively more weight to cows with the most clearly defined behavioral responses that were pervasive across environmental contexts.

In previous work with milking order records extracted from metadata produced by RFID-equipped rotary milking parlors, however, we also demonstrated that interesting ethological insights can be gleaned by recovering systematic temporal variations in behavioral responses using a data mechanics clustering approach, even when the environmental factors eliciting changes in queueing patterns were not recorded and only the behavioral artifacts remained in the response data. In light of these collective results, an algorithmic framework is needed to disaggregate overall time budgets to determine if variability that was previously penalized using ensemble mimicry approaches might in fact be hiding systematic heterogeneity in daily time budgets that might provide further insights into the behavioral responses of this herd of cows across management contexts. In previous work with entry order records, we showed that data mechanics can be conceptualized as a form of nonlinear PCA, and thus used as a discrete but more readily explainable alternative to manifold embedding algorithms. Its efficacy has been demonstrated for information compression for multivariate response data and for repeated measures of univariate response, but daily time budget records are a multivariate repeated measure. While it would be possible to flatten daily time budget records in order to analyze this dataset as a 2D matrix using the standard data mechanics analysis pipeline, this would afford no control or differentiation in weighting the behavioral and temporal axes in the final encodings. Forcing the algorithm to learn, from the data itself, fundamental structural features that could easily be specified a priori is inherently inefficient.
This chapter will therefore explore how data mechanics algorithms can be extended, not only to incorporate information about measurement error from new ensemble mimicry techniques, but also to utilize tensor constructs to more naturally accommodate the complex temporal structures often encountered in PLF data streams. The result will be an algorithmic framework that will allow ethologists to leverage simultaneously the multivariate and temporal richness of similar PLF data streams, in order to provide an approach to model-free knowledge discovery that is not only more statistically powerful, but will also provide more holistic ethological inferences.

To explore the efficacy of this proposed analytical framework, data was repurposed from a feed trial conducted on a working dairy farm assessing the efficacy of an organic fat supplement on cow health during early lactation. All animal handling and experimental protocols were approved by the Colorado State University Institutional Animal Care and Use Committee. The study was conducted in 2017, from January through July, on a USDA Certified Organic dairy in Northern Colorado. Cows were housed in a closed herd in an open-sided free-stall barn that was stocked at roughly half capacity with respect to both feed bunk spaces and stalls, with free access to an adjacent outdoor dry lot. The grazing season began in April, at which point cows were moved onto pasture each night to comply with organic grazing standards. Cows had free access to TMR between milkings while in their home pen, and were head locked each morning to facilitate daily health checks and heat detection. For more details on feed trial protocols, the housing environment, and the day-to-day management of animals, see Manriquez et al. and Manriquez et al.
In order to facilitate direct comparisons with previously reported work, daily time budget records were calculated for this methods paper using the same CowManager ear tag accelerometer records used previously to analyze overall time budgets. Data scrubbing decisions used to exclude incomplete records were identical to those reported previously to calculate overall time budgets, with the only appreciable difference in data wrangling being that here hourly time budgets were aggregated conditional on day of observation.
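
The aggregation step described above can be sketched as follows. The record layout (tuples of cow, day, and minutes per behavior), the behavior labels, and the function name `daily_time_budgets` are hypothetical. The actual sensor export format and the data scrubbing rules are not described in this section and are not reproduced here.

```python
from collections import defaultdict

def daily_time_budgets(hourly_records):
    """Aggregate hourly behavior minutes into one time budget per cow per day.

    hourly_records: iterable of (cow_id, day, {behavior: minutes}) tuples
    (an assumed layout). Returns {(cow_id, day): {behavior: proportion}},
    with the proportions for each cow-day summing to 1.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for cow_id, day, minutes in hourly_records:
        for behavior, m in minutes.items():
            totals[(cow_id, day)][behavior] += m
    budgets = {}
    for key, by_behavior in totals.items():
        day_total = sum(by_behavior.values())
        if day_total > 0:  # skip cow-days with no recorded minutes
            budgets[key] = {b: m / day_total for b, m in by_behavior.items()}
    return budgets

# Toy records (hypothetical values)
records = [
    ("cow1", 1, {"eating": 30, "rumination": 30}),
    ("cow1", 1, {"eating": 10, "rumination": 50}),
    ("cow1", 2, {"eating": 60, "rumination": 0}),
]
print(daily_time_budgets(records))
```

Keying the result on the (cow, day) pair preserves the temporal axis, so the records can later be arranged as a cow by day by behavior array rather than being collapsed into a single overall budget per cow.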