  1. Solr
  2. SOLR-13167

Duplicate Child Documents and nondeterministic search

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.5
    • Fix Version/s: None
    • Component/s: search, SolrCloud
    • Labels: None
    • Environment: Solr 7.5 running on AWS EC2 instances with an AMI OS, split into two shards running on two different EC2 instances, using Solr's built-in ZooKeeper

    Description

      I have a product search hosted on a SolrCloud cluster with 2 shards on two EC2 instances, with the following setup:

      A product has an unlimited number of children, which are small objects with shop information. These child documents define the shops where the product is available. My requirement is to update / sync the whole documents (parent and children) at least once a day. The availability information is stored in the child documents in a quantity field.
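To make the setup concrete, here is a minimal sketch of what one such nested block might look like as a Solr JSON document. The field names nodeType, shopId and quantity come from this report; the parent marker value "product", the child id scheme, and all sample values are assumptions for illustration only.

```python
import json


def make_product(product_id, shops):
    """Build one nested block: a product parent with one child per shop."""
    return {
        "id": product_id,
        # Assumed parent marker; the report only names nodeType:shop.
        "nodeType": "product",
        # Solr 7.x JSON key for nested child documents.
        "_childDocuments_": [
            {
                # Assumed id scheme so that child ids are unique across parents.
                "id": f"{product_id}_{shop_id}",
                "nodeType": "shop",
                "shopId": shop_id,
                "quantity": quantity,
            }
            for shop_id, quantity in shops
        ],
    }


# Hypothetical sample data: product P1 is stocked in shops S1 and S2.
doc = make_product("P1", [("S1", 4), ("S2", 0)])
print(json.dumps(doc, indent=2))
```

Deriving each child id from the parent id is only one possible choice; the point is that a child id should not be shared between two different parents.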

      Problem:

      1. After every sync, the number of child documents (shops) increases and the nesting gets deeper, because the quantity changes and the child documents are apparently not updated by id but created anew with the same id (duplicate documents, comparable to SOLR-5211, SOLR-6096, and SOLR-12638).
      2. Whenever I sync products and their children with one level of depth (parent > child), I get parent > child > child > child > ..., depending on how many children there are (see screenshot-4.png). These children also cannot be retrieved with nodeType:shop.
      3. Whenever I query the products (parents) by a child attribute (shopId), the search is nondeterministic and does not return the correct products. Many products contain children that were never assigned to them. Some products are flooded with a huge number of children (>1000) although only about 10 are assigned. As screenshot-1 to screenshot-3 show, three identical queries return different products. screenshot-1, with 26241 results, shows the correct count and data; the other two are completely wrong.
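One possible workaround, sketched below, is to delete the whole old block before re-adding it, instead of relying on an overwrite by id. This is an assumption about what might help here, not a confirmed fix for this issue. It assumes the schema has the standard `_root_` field, that parents carry the hypothetical marker nodeType:product, and that `solr_url` points at a hypothetical collection. The second helper builds a block join parent query for the lookup by shopId.

```python
import json
import urllib.request


def quote(value):
    """Wrap a value as a quoted query phrase, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def resync_payload(product):
    """Build one JSON update body: delete the old block, then add the new one."""
    pid = quote(product["id"])
    commands = {
        # _root_ holds the parent id on every document of a nested block;
        # id:... also covers a parent that was indexed without children.
        "delete": {"query": f"_root_:{pid} OR id:{pid}"},
        "add": {"doc": product},
    }
    return json.dumps(commands)


def parents_by_shop(shop_id):
    """Build query params that return parents having a child with this shopId."""
    # "nodeType:product" is an assumed filter that must match all parents and only parents.
    return {"q": '{!parent which="nodeType:product"}shopId:' + quote(shop_id)}


def post_update(solr_url, body):
    """POST an update body to <solr_url>/update (solr_url is hypothetical)."""
    req = urllib.request.Request(
        solr_url + "/update",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical sample block; no request is sent here.
product = {
    "id": "P1",
    "nodeType": "product",
    "_childDocuments_": [
        {"id": "P1_S1", "nodeType": "shop", "shopId": "S1", "quantity": 4}
    ],
}
print(resync_payload(product))
print(parents_by_shop("S1")["q"])
```

The delete is placed before the add in the same request so the old children are removed before the new block is written; a commit would still have to follow once the daily sync batch is complete.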

      I would really appreciate any workaround or help with these issues. This is a huge problem, and my business depends on it.

       

      Attachments

        1. screenshot-4.png
          156 kB
          Kevin Bachmann
        2. screenshot-3.png
          421 kB
          Kevin Bachmann
        3. screenshot-2.png
          418 kB
          Kevin Bachmann
        4. screenshot-1.png
          417 kB
          Kevin Bachmann

          People

            Assignee: Unassigned
            Reporter: Kevin Bachmann (kbax)