backup_log_dir_for_component_Dgraph2 failed in bas

Posted 2019-03-04 14:50

Question:

During a baseline update, I get the error "backup_log_dir_for_component_Dgraph2 failed".

1. Below is the error from the baseline_update.out file:

            Setting flag 'baseline_data_ready' in the EAC.
                1 file(s) moved.
        [06.28.16 05:26:02] INFO: Checking definition from AppConfig.xml against existing EAC provisioning.
        [06.28.16 05:26:02] INFO: Definition has not changed.
        [06.28.16 05:26:02] INFO: Starting baseline update script.
        [06.28.16 05:26:02] INFO: Acquired lock 'update_lock'.
.
.
.
more logs in between
.
.
.
        [06.28.16 05:26:17] INFO: [ITLHost] Starting component 'Forge'.
        [06.28.16 05:45:14] INFO: [ITLHost] Starting backup utility 'backup_log_dir_for_component_Dgidx'.
        [06.28.16 05:45:15] INFO: [ITLHost] Starting component 'Dgidx'.
        [06.28.16 06:00:59] INFO: [MDEXHost] Starting shell utility 'cleanDir_local-dgraph-input'.
        [06.28.16 06:01:01] INFO: [MDEXHost] Starting shell utility 'rmdir_dgraph-input-old'.
        [06.28.16 06:01:03] INFO: [MDEXHost] Starting copy utility 'copy_index_to_host_MDEXHost'.
        [06.28.16 06:01:26] INFO: Applying index to dgraphs in restart group 'A'.
        [06.28.16 06:01:26] INFO: [MDEXHost] Starting shell utility 'mkpath_dgraph-input-new'.
        [06.28.16 06:01:27] INFO: [MDEXHost] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_Dgraph1'.
        [06.28.16 06:01:59] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
        [06.28.16 06:02:01] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
        [06.28.16 06:02:02] INFO: [MDEXHost] Starting backup utility 'backup_log_dir_for_component_Dgraph1'.
        [06.28.16 06:02:03] INFO: [MDEXHost] Starting component 'Dgraph1'.
        [06.28.16 06:02:10] INFO: [MDEXHost] Starting shell utility 'rmdir_dgraph-input-old'.
        [06.28.16 06:02:12] INFO: Applying index to dgraphs in restart group 'B'.
        [06.28.16 06:02:12] INFO: [MDEXHost] Starting shell utility 'mkpath_dgraph-input-new'.
        [06.28.16 06:02:13] INFO: [MDEXHost] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_Dgraph2'.
        [06.28.16 06:02:38] INFO: Stopping component 'Dgraph2'.
        [06.28.16 06:02:39] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
        [06.28.16 06:02:40] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
        [06.28.16 06:02:42] INFO: [MDEXHost] Starting backup utility 'backup_log_dir_for_component_Dgraph2'.
        [06.28.16 06:02:43] SEVERE: Utility 'backup_log_dir_for_component_Dgraph2' failed. Refer to utility logs in [ENDECA_CONF]/logs/archive on host MDEXHost.
        Occurred while executing line 5 of valid BeanShell script:
        [[
        2|      
        3|    DgraphCluster.cleanDirs();
        4|    DgraphCluster.copyIndexToDgraphServers();
        5|    DgraphCluster.applyIndex();
        6|     
        7|   
        ]]

        [06.28.16 06:02:43] SEVERE: Error executing valid BeanShell script.
        Occurred while executing line 35 of valid BeanShell script:
        [[
        32|        Dgidx.run();
        33|       
        34|        // distributed index, update Dgraphs
        35|        DistributeIndexAndApply.run();
        36|
        37|        // if Web Studio is integrated, update Web Studio with latest
        38|        // dimension values
        ]]

        [06.28.16 06:02:43] SEVERE: Caught an exception while invoking method 'run' on object 'BaselineUpdate'. Releasing locks.

2. Below is the error from the backup_log_dir_for_component_Dgraph2.log file (file path: PlatformServices\workspace\logs\archive):

Renaming G:\Endeca\MyEndecaApp\config\script\..\..\.\logs\dgraphs\Dgraph2 to G:\Endeca\MyEndecaApp\config\script\..\..\.\logs\dgraphs\Dgraph2.2016_06_28.06_02_42
Unable to rename G:\Endeca\MyEndecaApp\config\script\..\..\.\logs\dgraphs\Dgraph2 to G:\Endeca\MyEndecaApp\config\script\..\..\.\logs\dgraphs\Dgraph2.2016_06_28.06_02_42: Permission denied

I ran the baseline update several more times. Sometimes the backup fails for Dgraph1 and sometimes for Dgraph2. After the failure, the affected dgraph is left stopped.


Edit 1: I have observed that when I stop both dgraphs from Workbench and then run the baseline update, it always succeeds. I tried this 4-5 times. We know that baseline_update stops each dgraph before backing up its log folder, so I assume the dgraph has not fully stopped by the time baseline_update tries to back up the log folder, and that this is what causes the error.
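
The timestamps in the log excerpt are consistent with that assumption: only a few seconds pass between the stop request for Dgraph2 and the start of the log backup. Below is a minimal Python sketch of that arithmetic, using three lines taken from the excerpt above. The regular expression and helper names are illustrative, and reading the timestamp as month.day.year is an assumption (it matches the 2016_06_28 suffix in the archive name).

```python
import re
from datetime import datetime

# Three lines taken from the baseline_update.out excerpt in the question.
LOG = """\
[06.28.16 06:02:38] INFO: Stopping component 'Dgraph2'.
[06.28.16 06:02:42] INFO: [MDEXHost] Starting backup utility 'backup_log_dir_for_component_Dgraph2'.
[06.28.16 06:02:43] SEVERE: Utility 'backup_log_dir_for_component_Dgraph2' failed. Refer to utility logs in [ENDECA_CONF]/logs/archive on host MDEXHost.
"""

# Assumed layout: [MM.DD.YY HH:MM:SS] LEVEL: message
LINE = re.compile(r"\[(\d\d\.\d\d\.\d\d \d\d:\d\d:\d\d)\] (\w+): (.*)")


def parse(log):
    """Return (timestamp, level, message) tuples for every timestamped line."""
    events = []
    for raw in log.splitlines():
        match = LINE.match(raw.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%m.%d.%y %H:%M:%S")
            events.append((ts, match.group(2), match.group(3)))
    return events


def seconds_between(events, first, second):
    """Seconds from the first message containing `first` to the next one containing `second`."""
    start = next(ts for ts, _, msg in events if first in msg)
    end = next(ts for ts, _, msg in events if second in msg and ts >= start)
    return (end - start).total_seconds()


events = parse(LOG)
gap = seconds_between(events, "Stopping component 'Dgraph2'", "Starting backup utility")
print(gap)  # 4.0
```

A 4-second gap does not prove the dgraph was still running, but it leaves little time for the process to exit and release its handles on the log folder.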

Please help me resolve this issue. I am a novice in Endeca administration.

Thank you

Answer 1:

There are a couple of scenarios that will cause the permission problem.

According to the Endeca installation documentation, you should install Endeca as a specific user on the Windows server. Let's assume that user is called 'endeca'. Did you make sure that the 'endeca' user is the current owner of the G:\Endeca\MyEndecaApp folder and its subfolders? After setting the owner, you also need to grant the 'endeca' user Full control on this folder. Are you running your Endeca services as the 'endeca' user?

Assuming you've done the above and still have the issue, it can also depend on how you start your index. If you kick off a baseline index from the CMD prompt, are you doing this as yourself, as the 'endeca' user, or as 'Administrator'? The user who ran the last index determines whether subsequent runs have permission on the files it created. I tend to do CMD line executions as 'Administrator' and have had very few permission problems.

Are you perhaps inspecting the log files in 'Notepad.exe'? It locks the file aggressively, so you won't be able to rename the file, or the folder, while you have it open in 'Notepad'. Either make sure you don't have it open in 'Notepad' or use 'Notepad++', which doesn't lock the file.

Lastly, I've also had issues where a CMD prompt was open in the log folder that needed to be renamed. So make sure your CMD prompt is either closed or not sitting in one of your log folders.

Been running Endeca on Windows Server 2012 R2 for the last 3 years and those are the only issues I've had. If all else fails, you can always try the Sysinternals tools, in particular 'procmon.exe', but it outputs a lot of information while you are building an index, so be prepared for information overload.
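
A lighter check before reaching for procmon is to probe whether the folder can be renamed at all, and how long it takes to become renamable after the dgraph stops. Here is a minimal Python sketch using only the standard library. The function name and the `.rename_probe` suffix are hypothetical, and it assumes a folder held open by another process raises `PermissionError` on rename, which matches the "Permission denied" message in the utility log.

```python
import os
import time


def wait_until_renamable(path, timeout=120.0, interval=1.0):
    """Try to rename `path` and immediately rename it back.

    Returns the seconds waited until the rename first succeeded, or None if
    it kept failing with PermissionError for `timeout` seconds. Any other
    error (for example a missing folder) is raised to the caller.
    """
    probe = path + ".rename_probe"  # hypothetical temporary name
    start = time.monotonic()
    while True:
        try:
            os.rename(path, probe)
            os.rename(probe, path)  # restore the original name
            return time.monotonic() - start
        except PermissionError:
            # Something still holds the folder (or a file in it) open.
            if time.monotonic() - start >= timeout:
                return None
            time.sleep(interval)
```

Calling `wait_until_renamable(r"G:\Endeca\MyEndecaApp\logs\dgraphs\Dgraph2")` right after stopping Dgraph2 would show whether the lock is transient (a small number of seconds) or persistent (`None`, which points at an editor, a CMD prompt, or a permissions problem instead). Run it only while no baseline update is in progress, since it briefly renames the folder.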



Answer 2:

Changing the Dgraph property 'numIdleSecondsAfterStop' to 90 seconds from IAP Workbench solved the problem.

This shows that the failure occurred because the Dgraph had not fully stopped before the rename, so the log folder was still locked by the Dgraph.

Setting 'numIdleSecondsAfterStop' makes the baseline update wait 90 seconds after the Dgraph stops before it proceeds to the next steps.
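
The effect described here can be pictured as a stop, a fixed pause, and then the rename. The Python sketch below only illustrates that sequence and is not Endeca's implementation. `stop_component` is a hypothetical callable, and treating the property as a plain fixed sleep is an assumption based on this answer. The timestamp suffix follows the `Dgraph2.2016_06_28.06_02_42` name seen in the utility log.

```python
import os
import time
from datetime import datetime


def backup_log_dir(log_dir, now=None):
    """Rename log_dir to log_dir.<YYYY_MM_DD.HH_MM_SS> and return the new path."""
    now = now or datetime.now()
    target = f"{log_dir}.{now.strftime('%Y_%m_%d.%H_%M_%S')}"
    os.rename(log_dir, target)
    return target


def stop_then_backup(stop_component, log_dir, idle_seconds=90, sleep=time.sleep):
    """Stop the component, wait idle_seconds, then archive its log directory."""
    stop_component()     # hypothetical: whatever actually stops the Dgraph
    sleep(idle_seconds)  # give the process time to exit and release its handles
    return backup_log_dir(log_dir)
```

With `idle_seconds=0` this reproduces the race from the question: the rename is attempted while the stopping process may still hold the folder. A fixed pause is simple but wasteful when the process exits quickly, which is the trade-off of solving the problem with a timeout.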



Answer 3:

The problem is clear from the logs: the folder doesn't exist, so the baseline process is unable to rename the Dgraph2 folder. This usually happens when an earlier baseline update failed partway through. For example, the script backs up a folder's contents, clears the folder, and then fails. When you rerun the process from the beginning, it tries to clear the same folder again, and you get the missing-folder error.

The simple solution is to create the missing Dgraph2 folder. Alternatively, when the update fails because Workbench is down or for any other reason, comment out the part of the AppConfig script that already ran successfully and rerun from the point of failure. Hope this helps!



Tags: endeca