Your chair example, and your comment on this answer, both invoke the way brains represent objects.
Very briefly, neuroscience models this as "neural representational spaces". A representational space is an abstract high-dimensional space wherein each point represents an object. Objects that are similar are closer together in the space, i.e. the distance between their points is small. The dimensions of the space are properties of the object (like colour) which our sensory systems have learned to extract because those features are good for discriminating the objects in question - hence the alternative name "feature spaces".
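To make the geometry concrete, here's a minimal Python sketch of that idea - the feature names and numbers are made up purely for illustration:

```python
# A minimal sketch of a "feature space": each object is a point whose
# coordinates are feature values, and similarity is (inverse) distance.
# The dimensions and values here are invented for illustration only.
import math

def distance(a, b):
    """Euclidean distance between two points in feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "chair space" with dimensions (hue, seat height, back angle)
office_chair = (0.20, 0.45, 0.95)
desk_chair   = (0.22, 0.47, 0.90)   # similar chair -> nearby point
deck_chair   = (0.70, 0.30, 0.40)   # dissimilar chair -> distant point

print(distance(office_chair, desk_chair))  # small
print(distance(office_chair, deck_chair))  # large
```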
Face space is the best-known example of such a space, with dimensions possibly encoding things like face width, height, inter-eye distance, etc. - basically the visual features that help us distinguish people from each other, and not, say, nose position, which is roughly the same for everybody. We probably have a "chair space" too, though it won't be as developed (evolutionarily or through learning) as face space.
Your question, then, is how many distinct points we can get into such a feature space, summed over all such spaces in a brain and over all possible brains.
(In reality there are multiple face spaces, so each face is represented multiple times: we have distinct face spaces for person recognition vs emotion detection, for static vs dynamic facial information... Brains also stack perceptual representations into hierarchical streams, so spaces representing individual facial features sit upstream of the "whole-face" spaces I've described, which in turn feed higher "whole-person" spaces, and so on. A single "thought" - viewing or imagining a face in this example - therefore strings together activation across several representational spaces, depending on the specific perceptual task. I'll ignore this for simplicity.)
I guess this is where considerations of finite vs infinite, integers vs reals, and "minimum lengths" like the Planck length come in. Take a point in one of these neural representational spaces and consider its coordinates: are they discrete or continuous-valued, for example?
Regarding the "implementation" of such spaces, dimensions are often equated to neurons, so the value of a coordinate is a neural firing rate. If you've got 10 neurons then at most you can have a 10D feature space (so you'd better hope it doesn't take many to discriminate the objects in question). In reality there is redundancy and the brain is a fuzzy statistical machine, so the literal mapping of feature -> neuron is not so clean.
Anyway, regarding neural firing rates: a neuron can't fire infinitely fast, but I don't think there's any restriction on the number of distinct values its rate can take between minimum and maximum. Neurons are biological rather than digital, so simply as a result of noise/imperfection there will be variability in a neuron's firing rate; as far as I'm aware it's regarded as continuous-valued.
That said, given the noisiness of such representations, it cannot be that literally every 'distinct' point in the space (i.e. every different set of firing rates, or coordinates) is taken to represent a distinct chair or face - people's identities would seem to be constantly changing! Instead we probably recognise a face via the closest proximity of its representation in face space to stored representations of faces from memory.
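As a sketch of that recognition-by-proximity idea (assuming, purely for illustration, that a face is represented as a small vector of noisy firing rates, with invented names and numbers):

```python
# Sketch of recognition-by-proximity: a noisy point in "face space"
# (a vector of firing rates, one per neuron/dimension) is matched to
# whichever stored face representation it lands closest to.
# All names, rates and noise levels are invented for illustration.
import math, random

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Stored (remembered) face representations: mean firing rates in Hz
memory = {
    "Alice": (12.0, 40.0, 7.0, 25.0),
    "Bob":   (30.0, 15.0, 22.0, 5.0),
}

def perceive(true_rates, noise_sd=2.0):
    """Simulate one noisy presentation of a face."""
    return tuple(r + random.gauss(0, noise_sd) for r in true_rates)

def recognise(observed):
    """Identity = nearest stored representation, not an exact match."""
    return min(memory, key=lambda name: distance(observed, memory[name]))

observed = perceive(memory["Alice"])
print(observed, "->", recognise(observed))  # -> Alice (almost always)
```

The observed point almost never equals the stored point exactly, yet it still falls nearest to the right identity - which is the sense in which noisy, distinct points need not each count as a new face.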
Regarding "chair space" and the "colour" feature specifically, this is akin to saying there's a minimum difference in colour below which you won't be able to distinguish two [otherwise identical] chairs. This is the just-noticeable difference of psychophysics, which is what you're looking for wrt consideration of 'perceptual fidelity', but note that it's defined as a difference between two stimuli. The idea is not individual stimuli spaced over a range (the feature dimension), but only pairs of stimuli compared. I understand this is an in-principle experimental constraint: much like voltage, we can't ask objective questions about single stimuli but rather only in comparison to a reference.
Returning to facial recognition via proximity in face space, we are worse at this for "kinds" of faces we've seen little of before*.
*I've linked the general effect, but it happens at the visual level for face perception too: we're more prone to misidentifying individuals of races we didn't grow up with, and might even say "they all look the same" if we're racist.
It seems learning improves the 'fidelity' of regions of representational space, and different brains have undergone different learning. You could probably learn to distinguish chair colours better than you currently can. Then, if we show you photos of two chairs, whereas before they might have activated the same "thought" (percept), inasmuch as you couldn't distinguish them, afterwards you can, so those chairs are now represented by 'sufficiently different' points in your brain's chair space.
Given the brain's amazing potential for learning, and the fact that your question is about all possible thoughts, I think the potential number is infinite. Maybe one brain at one snapshot in time could only have a finite number of distinguishable thoughts as defined above. If you instead define a thought by physical brain states, e.g. as measured via electrodes, then those measures are continuous-valued, so the number of them is infinite (like the number of reals between 1 and 2 is infinite). But that feels a bit like saying the number of values a CPU transistor can take is infinite, because although we only read out 1 or 0, the actual voltage fluctuates.
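That analogy in sketch form (arbitrary threshold and voltages):

```python
# Sketch of the transistor analogy: the underlying physical value is
# continuous, but the readout collapses it to a few discrete states, so
# "infinitely many physical states" needn't mean infinitely many
# distinguishable readouts. Threshold and voltages are arbitrary.
def read_bit(voltage, threshold=0.5):
    return 1 if voltage >= threshold else 0

# Infinitely many possible voltages...
for v in (0.48, 0.4999, 0.51, 0.9):
    print(v, "->", read_bit(v))   # ...but only two distinguishable readouts
```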
I've not yet proofread this and wrote it from memory having left the field years ago