Say you want to calculate the probability that your theory T is correct given observations O. According to Bayes' Theorem, the probability of T given O is
P(T|O) = P(O|T) P(T) / P(O)
where P(O|T) is the probability of O given T, P(T) is the prior probability of T before accounting for the observations O, and P(O) is the prior (marginal) probability of O.
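As a minimal numerical sketch of the formula, with probabilities chosen purely for illustration (none of these values come from any real data set):

```python
# Hypothetical numbers, for illustration only.
p_T = 0.1          # prior probability of the theory T
p_O_given_T = 0.8  # how strongly T predicts the observations O
p_O = 0.2          # prior (marginal) probability of O

# Bayes' Theorem: P(T|O) = P(O|T) P(T) / P(O)
p_T_given_O = p_O_given_T * p_T / p_O
print(p_T_given_O)  # posterior probability of T, here 0.4
```

With these numbers, observing O raises the probability of T from 0.1 to 0.4, because T predicted O four times more strongly than the background rate of O.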
P(O|T) is the degree to which T predicts O. If T doesn't predict O then either (i) T predicts that O is unlikely/impossible, or (ii) T is uncorrelated with O.
Let's consider case (ii) where T is consistent with O but doesn't preferentially predict O.
If T is consistent with O but uncorrelated with it, then the probability of O given T is just the probability of O; that is, P(O|T) = P(O). Substituting into Bayes' Theorem, the P(O) terms cancel and P(T|O) = P(T). In this case, you cannot make any inference about T from O.
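The cancellation can be checked with exact arithmetic (again with hypothetical probabilities, using exact fractions so the equality is not muddied by floating-point rounding):

```python
from fractions import Fraction

p_T = Fraction(1, 10)  # hypothetical prior for T
p_O = Fraction(1, 5)   # hypothetical marginal probability of O
p_O_given_T = p_O      # T is uncorrelated with O: P(O|T) = P(O)

p_T_given_O = p_O_given_T * p_T / p_O
print(p_T_given_O == p_T)  # True: the posterior equals the prior
```

Whatever prior you start with, you end with: the observation has taught you nothing about T.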
More intuitively, the support that observations O lend to a theory T is proportional to the force with which T predicts O. If T doesn't preferentially predict O, you are not justified in making an inference from O to T.
For an Intelligent Design theory to be inferred from the data, it must be specific enough to predict something about the observations.
One objection to this conclusion would be to claim that an arbitrary T can always be fitted to O such that it preferentially predicts O. For example, one might claim that "T is the generic theory that there is an unknown agent that is responsible for O." However, the problem here is that T is no longer an inference from O. It is a paraphrasing of O. (After all, we originally set out to answer the question "what is the unknown agent responsible for O?")
So, how do we know whether or not T is just paraphrasing O? We can know this by counting parameters. If we have N data points, we need N parameters to paraphrase the data without making any inferences. For example, we can always write the next number in a sequence as the sum of the previous number and some parameter tuned to give us the correct answer.
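This tuned-parameter "paraphrase" of a sequence can be sketched directly. Here the "theory" is the first value plus one delta per step, giving N parameters for N data points; it reproduces the data exactly but predicts nothing beyond it (the sequence itself is arbitrary, chosen only for illustration):

```python
observations = [3, 1, 4, 1, 5, 9]  # arbitrary example data

# One tuned parameter per step: each delta is whatever makes the
# reconstruction come out right.
deltas = [b - a for a, b in zip(observations, observations[1:])]
params = [observations[0]] + deltas  # N parameters for N data points

# "Rederive" the data from the parameters.
rebuilt = [params[0]]
for d in params[1:]:
    rebuilt.append(rebuilt[-1] + d)
print(rebuilt == observations)  # True, but no compression was achieved
```

Any sequence whatsoever can be matched this way, which is exactly why matching it tells us nothing.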
Therefore, an inference is a theory T with fewer parameters than there are data points in O. Since T has fewer parameters than O has data points, some proper subset of O must be predicted by T from the remainder of the data in O.
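By contrast, a genuinely predictive theory can be sketched as follows: six data points, but only two parameters (a start value and a step), fitted from a proper subset of the data and then tested against the rest. The data here are hypothetical, generated by a simple rule for illustration:

```python
observations = [2, 5, 8, 11, 14, 17]  # hypothetical data

# The "theory": an arithmetic progression with 2 parameters,
# fitted using only the first two data points.
start = observations[0]
step = observations[1] - observations[0]

# The theory now predicts the remaining four points.
predicted = [start + step * i for i in range(len(observations))]
print(predicted[2:] == observations[2:])  # True: held-out data predicted
```

Because the theory has fewer parameters than data points, its agreement with the held-out points is a real test, and passing it is real evidence for the theory.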
Generic ID makes no predictions, not even within the existing data we have. It has at least as many free parameters as any data set we throw at it. It is, at best, a paraphrasing of the data.
We can certainly infer the action of intelligent agency from some data sets, but only when our intelligent agent theory is predictive in some way.